Abstract

One recurring problem in program development is that of understanding how to re-use 
code developed by a third party. In the context of (constraint) logic programming, part of 
this problem reduces to figuring out how to query a program. If the logic program does not 
come with any documentation, then the programmer is forced to either experiment with 
queries in an ad hoc fashion or trace the control-flow of the program (backward) to infer 
the modes in which a predicate must be called so as to avoid an instantiation error. This 
paper presents an abstract interpretation scheme that automates the latter technique. 
The analysis presented in this paper can infer moding properties which, if satisfied by the
initial query, come with the guarantee that the program and query can never generate any
moding or instantiation errors. Other applications of the analysis are discussed. The paper
explains how abstract domains with certain computational properties (they condense) can 
be used to trace control-flow backward (right-to-left) to infer useful properties of initial 
queries. A correctness argument is presented and an implementation is reported. 



1 Introduction 

The myth of the lonely logic programmer writing a program in isolation is just 
that: a myth. Applications (and application components) are usually implemented 
and maintained by a team. One consequence of this is that a significant proportion of
the program development effort is devoted to understanding code developed by
another. One advantage of (constraint) logic programs for software development is
that their declarative nature makes them less opaque than, say, C++ programs. One
disadvantage of logic programs over C++ programs, however, is that the signature
(argument types) of a predicate does not completely specify how the predicate should
be invoked. In particular, a call to a predicate from an unexpected context may
generate an error if an argument of the call is insufficiently instantiated (even if
the program and query are well-typed). This is because logic programs contain
builtins and calls to these builtins often impose moding requirements on the query.
If the program is developed by another programmer, it may not be clear how to
query a predicate so as to avoid an instantiation error. In these circumstances, the
programmer will often resort to a trial and error tactic in their search for an initial 
call mode. This can be both frustrating and tedious and, of course, cannot guarantee 
coverage of all the program execution paths. This paper presents an analysis for 
inferring moding properties which, if satisfied by the initial query, ensure that the 
program does not generate instantiation errors. Of course, it does not mean that 
the inferred call has the form exactly intended by the original programmer - no 
analysis can do that - the analysis just recovers mode information. Nevertheless, 
this is a useful first step in understanding the code developed by another. 

The problem of inferring initial queries which do not lead to instantiation errors 
is an instance of the more general problem of deducing how to call a program 
so that it conforms to some desired property, for example, calls to builtins do not 
error, the program terminates, or calls to builtins behave predictably. The backward 
analysis presented in this paper is designed to infer conditions on the query which, 
if satisfied, guarantee that resulting derivations satisfy a property such as one of 
those above. Specifically, the analysis framework can be instantiated to solve the 
following analysis problems: 



• Builtins and library functions can behave unpredictably when called with
infinite rational trees. For example, the query ?- X = X + X, Y is X will
not terminate in SICStus Prolog because the arithmetic operator expects its
input to be a finite tree rather than an infinite rational tree. Moreover, the
standard term ordering of Prolog does not lift to rational trees, so the builtin
sort can behave unpredictably when sorting rational trees. These problems
(and related problems with builtins) motivate the use of dependency analysis
for tracking which terms are definitely finite (Bagnara et al., 2001). The basic
idea is to describe the constraint $x = f(x_1, \ldots, x_n)$ by the Boolean function
$x \Leftrightarrow \bigwedge_{i=1}^{n} x_i$ which encodes that $x$ is bound to a finite tree iff each $x_i$ is
bound to a finite tree. Although not proposed in the context of backward
analysis (Bagnara et al., 2001), the framework proposed in this paper can be
instantiated with a finite tree dependency domain to infer finiteness properties
on the query which, if satisfied, guarantee that builtins are not called with
problematic arguments.

• Termination inference is the problem of inferring initial modes for a query
that, if satisfied, ensure that a logic program terminates. This problem generalises
termination checking which verifies program termination for a class of
queries specified by a given mode. Termination inference dates back to (Mesnard,
1996) but it has been recently observed (Genaim & Codish, 2001) that
the missing link between termination checking and termination inference is
backward analysis. A termination inference analyser is reported in (Genaim
& Codish, 2001) composed from two components: a standard termination
checker (Codish & Taboch, 1999) and the backward analysis described in this
paper. The resulting analyser is similar to the cTI analyser of (Mesnard &
Neumerkel, 2001) - the main difference is its design as two existing black-box
components which, according to (Genaim & Codish, 2001), simplifies the
formal justification and implementation.

• Mode analysis is useful for implementing ccp programs. In particular (Debray
et al., 1992) explains how various low-level optimisations, such as returning
output values in registers, can be applied if goals can be scheduled left-to-right
without suspension. If the guards of the predicates are re-interpreted as
moding requirements, then the backward mode analysis can infer sufficient
conditions for avoiding deadlock under left-to-right scheduling. The analysis
presented in this paper thus has applications outside program development.

To summarise, the analysis presented in this paper can deduce properties of the call
which, if satisfied, guarantee that resulting derivations fulfill some desired property.
The analysis is unusual in that it applies lower approximation (see 2.4.1) as well
as upper approximation (see 2.3.1); it is formulated in terms of a greatest fixpoint
calculation (see 2.4) as well as least fixpoint calculation (see 2.3); the analysis also
imposes some unusual restrictions on the abstract domain (see 2.4.6).



1.1 Backward analysis 



Backward analysis has been applied extensively in functional programming in,
among other things, projection analysis (Wadler & Hughes, 1987), stream strictness
analysis (Hall & Wise, 1989), inverse image analysis (Dyber, 1991), etc. By
reasoning about the context of a function application, these analyses can identify
opportunities for eager evaluation that are missed by (forward) strictness analysis
as proposed by (Mycroft, 1981). Furthermore, backward reasoning on imperative
programs dates back to the early days of static analysis (Cousot & Cousot, 1982).
By way of contrast, backward analysis has been rarely applied in logic programming.
One notable exception is the demand analysis of (Debray, 1993). This analysis
infers the degree of instantiation necessary for the guards of a concurrent constraint
program (ccp) to reduce. It is a local analysis that does not consider the possible
suspension of body calls. This analysis detects those (uni-modal) predicates which
can be implemented with specialised suspension machinery. A more elaborate backward
analysis for ccp is presented by (Falaschi et al., 2000). This demand analysis
infers how much input is necessary for a procedure to generate a certain amount of
output. This information is useful for adding synchronisation (ask) constraints to a
procedure to delay execution and thereby increase grain size, and yet not introduce
deadlock. (Section 7 provides a more extensive and reflective review of the related
work.)



1.2 Contributions

Our work is quite different. As far as we are aware, it is unique in that it focuses on 
the backward analysis of (constraint) logic programs with left-to-right scheduling. 
Specifically, our work makes the following practical and theoretical contributions: 

• it shows how to compute an initial mode of a predicate which is safe in that
if a query is at least as instantiated as the inferred mode, the execution is
guaranteed to be free from instantiation errors. The modes inferred are often
disjunctive, sometimes surprising and, for the small predicates that we verified
by hand, appear to be optimal.

• it specifies a practical algorithm for calculating initial modes that is straightforward
to implement in that it reduces to two bottom-up fixpoint calculations.
Furthermore, this backward analysis problem cannot be solved with
any existing abstract interpretation machinery.

• to the best of our knowledge, it is the first time domains that are closed under
Heyting completion (Giacobazzi & Scozzari, 1998), or equivalently are
condensing (Marriott & Søndergaard, 1993), have been applied to backward
analysis. Put another way, our work adds credence to the belief that condensation
is an important property in the analysis of logic programs.

The final point requires some unpacking. Condensation was originally proposed in
(Langen, 1991), though arguably the simplest statement of this property (Marriott
& Søndergaard, 1993) is for downward closed domains such as Pos (Armstrong
et al., 1998) and the Pos-like type dependency domains (Codish & Lagoon, 2000).
Suppose that $f : X \to X$ is an abstract operation on a downward closed domain
$X$ equipped with an operation $\wedge$ that mimics unification or constraint solving. $X$
is condensing iff $x \wedge f(y) = f(x \wedge y)$ for all $x, y \in X$. Hence, if $X$ is condensing,
$x \wedge f(true) = f(x)$ where $true$ represents the weakest abstract constraint.
More exactly, if $f(true)$ represents the result of the goal-independent analysis, and
$f(x)$ the result of the goal-dependent one with an initial constraint $x$, then the
equivalence $f(x) = x \wedge f(true)$ enables goal-dependent analysis to be performed
in a goal-independent way without loss of precision. This, in turn, can simplify
the implementation of an analyser (Armstrong et al., 1998). Because of this, domain
refinement machinery has been devised to enrich a domain with new elements
to obtain the desired condensing property (Giacobazzi & Scozzari, 1998). It turns
out that it is always possible to systematically design a condensing domain for a
given downward closed property (Giacobazzi & Scozzari, 1998) [Theorem 8.2] by
applying Heyting completion. Conversely, under some reasonable hypotheses, all
condensing domains can be reconstructed by Heyting completion (Giacobazzi &
Scozzari, 1998) [Theorem 8.3]. One consequence of this is that condensing domains
come equipped with a (pseudo-complement) operator and this turns out to be an
operation that is important in backward analysis. To summarise, machinery has
been developed to synthesise condensing domains and condensing domains provide
operations suitable for backward analysis.



1.3 Organisation of the paper 

The rest of the paper is structured as follows. Section 2 introduces the key ideas
of the paper in an informal way through a worked example. Section 3 introduces
the necessary preliminaries for the formal sections that follow. Section 4 presents
an operational semantics for constraint logic programs with assertions in which the
set of program states is augmented by a special error state. Section 5 develops a
semantics which computes those initial states that cannot lead to the error state.
The semantics defines a framework for backward analysis and formally argues correctness.
Section 6 describes an instantiation of the framework for mode analysis.
Section 7 reviews the related work and section 8 concludes. Much of the formal
machinery is borrowed directly from (Giacobazzi et al., 1995; Giacobazzi & Scozzari,
1998) and in particular the reader is referred to (Giacobazzi et al., 1995) for
proofs of the semantic results stated in section || (albeit presented in a slightly different
form). To aid continuity in the paper, the remaining proofs are relegated to
appendix A.



2 Worked example 

2.1 Basic components 

This section informally presents an abstract interpretation scheme which infers how
to query a given predicate so as to avoid run-time moding errors. In other words,
the analysis deduces moding properties of the call that, if satisfied, guarantee that
resulting derivations cannot encounter an instantiation error. To illustrate, consider
the Quicksort program listed in the left column of figure 1. This is the first ingredient
of the analysis: the input program. The second ingredient is an abstract domain
which, in this case, is Pos. Pos is the domain of positive Boolean functions, that is,
the set of functions $f : \{0,1\}^n \to \{0,1\}$ such that $f(1, \ldots, 1) = 1$. Hence $x \vee y \in Pos$
since $1 \vee 1 = 1$ but $\neg x \notin Pos$ since $\neg 1 = 0$. Pos is augmented with the bottom
element 0 with 1 being the top element. The domain is ordered by entailment $\models$
and, in this example, will be used to represent grounding dependencies.

Pos comes equipped with the logical operations: conjunction $\wedge$, disjunction $\vee$,
implication $\Rightarrow$ (and thus bi-implication $\Leftrightarrow$). Conjunction is used to conjoin the
information from different body atoms, while disjunction is used to combine the
information from different clauses. Conjunction and disjunction, in turn, enable two
projection operators to be defined: $\exists_x(f) = f[x \mapsto 0] \vee f[x \mapsto 1]$ and $\forall_x(f) = f'$ if
$f' \in Pos$ otherwise $\forall_x(f) = 0$ where $f' = f[x \mapsto 0] \wedge f[x \mapsto 1]$. Note that although
$f[x \mapsto 0] \vee f[x \mapsto 1] \in Pos$ for all $f \in Pos$ it does not follow that
$f[x \mapsto 0] \wedge f[x \mapsto 1] \in Pos$ for all $f \in Pos$. Indeed,
$(x \Leftarrow y)[x \mapsto 0] \wedge (x \Leftarrow y)[x \mapsto 1] = \neg y$. Both
operators are used to project out the body variables that are not in the head of
a clause. Specifically, these operators eliminate the variable $x$ from the formula $f$.
They are dual in the sense that $\forall_x(f) \models f \models \exists_x(f)$. These are the basic components
of the analysis.
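
The two projection operators can be checked mechanically on small examples. The following Python sketch is illustrative only and is not the implementation reported in the paper: it encodes a Boolean function as a callable over a 0/1 environment (an encoding assumed here for brevity), and the helper names `models`, `in_pos`, `subst`, `exists`, `forall` and `entails` are hypothetical.

```python
from itertools import product

VS = ("x", "y")  # variables of the running example

def models(f):
    """Assignments to VS (as bit tuples) that satisfy f."""
    return {bits for bits in product((0, 1), repeat=len(VS))
            if f(dict(zip(VS, bits)))}

def in_pos(f):
    """f is positive iff it holds when every variable is 1."""
    return bool(f({v: 1 for v in VS}))

def subst(f, x, b):
    """f[x -> b]: fix variable x to the constant b."""
    return lambda env: f({**env, x: b})

def exists(x, f):
    """Existential projection: f[x -> 0] or f[x -> 1]."""
    return lambda env: subst(f, x, 0)(env) or subst(f, x, 1)(env)

def forall(x, f):
    """Universal projection: f[x -> 0] and f[x -> 1], or 0 if that leaves Pos."""
    g = lambda env: subst(f, x, 0)(env) and subst(f, x, 1)(env)
    return g if in_pos(g) else (lambda env: False)

def entails(f, g):
    """f |= g: every model of f is a model of g."""
    return models(f) <= models(g)

x_if_y = lambda env: env["x"] or not env["y"]   # the formula x <= y
# The raw conjunction of the two instances, before the Pos check:
raw = lambda env: subst(x_if_y, "x", 0)(env) and subst(x_if_y, "x", 1)(env)
not_y = lambda env: not env["y"]
```

Under this encoding, `raw` has the same models as `not_y`, which is not positive, so `forall("x", x_if_y)` collapses to the bottom element 0, while `exists("x", x_if_y)` is 1.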



2.2 Normalisation and abstraction 

The analysis components are assembled in two steps. The first is a bottom-up analysis
for success patterns, that is, a bottom-up analysis which infers the groundness
dependencies which are known to be created by each predicate regardless of the
calling pattern. This step is a least fixpoint (lfp) calculation. The second step is a
bottom-up analysis for input modes (the objective of the analysis). This step is a
greatest fixpoint (gfp) computation. To simplify both steps, the program is put into
a form in which the arguments of head and body atoms are distinct variables. This
gives the normalised program listed in the centre column of figure 1. This program
is then abstracted by replacing each Herbrand constraint $x = f(x_1, \ldots, x_n)$ with
a formula $x \Leftrightarrow \bigwedge_{i=1}^{n} x_i$ that describes its grounding dependency. This gives the
abstract program listed in the right column of figure 1. The formula 1 in the assertion
represents true whereas the formulae $g_i$ that appear in the abstract program are
as follows:

$g_1 = t_1 \wedge (t_2 \Leftrightarrow s)$
$g_2 = (t_1 \Leftrightarrow (m \wedge xs)) \wedge (t_3 \Leftrightarrow (m \wedge r))$
$g_3 = t_1 \wedge t_2 \wedge t_3$
$g_4 = (t_1 \Leftrightarrow (x \wedge xs)) \wedge (t_2 \Leftrightarrow (x \wedge l))$
$g_5 = (t_1 \Leftrightarrow (x \wedge xs)) \wedge (t_2 \Leftrightarrow (x \wedge h))$
$g_6 = m \wedge x$

Raw program (left column):

    qs([], s, s).
    qs([m|xs], s, t) :-
        pt(xs, m, l, h),
        qs(l, s, [m|r]),
        qs(h, r, t).

    pt([], _, [], []).
    pt([x|xs], m, [x|l], h) :-
        m =< x,
        pt(xs, m, l, h).
    pt([x|xs], m, l, [x|h]) :-
        m > x,
        pt(xs, m, l, h).

Normalised program (centre column):

    qs(t1, s, t2) :-
        t1 = [], t2 = s.
    qs(t1, s, t) :-
        t1 = [m|xs],
        t3 = [m|r],
        pt(xs, m, l, h),
        qs(l, s, t3),
        qs(h, r, t).

    pt(t1, _, t2, t3) :-
        t1 = [],
        t2 = [], t3 = [].
    pt(t1, m, t2, h) :-
        t1 = [x|xs],
        t2 = [x|l],
        m =< x,
        pt(xs, m, l, h).
    pt(t1, m, l, t2) :-
        t1 = [x|xs],
        t2 = [x|h],
        m > x,
        pt(xs, m, l, h).

Abstracted program (right column):

    qs(t1, s, t2) :-
        1 ⋄ g1.
    qs(t1, s, t) :-
        1 ⋄ g2,
        pt(xs, m, l, h),
        qs(l, s, t3),
        qs(h, r, t).

    pt(t1, _, t2, t3) :-
        1 ⋄ g3.
    pt(t1, m, t2, h) :-
        1 ⋄ g4,
        =<'(m, x),
        pt(xs, m, l, h).
    pt(t1, m, l, t2) :-
        1 ⋄ g5,
        >'(m, x),
        pt(xs, m, l, h).

    =<'(m, x) :- g6 ⋄ g6.
    >'(m, x) :- g6 ⋄ g6.

Fig. 1. Quicksort: raw, normalised and abstracted

Builtins that occur in the source, such as the tests =< and >, are handled by
augmenting the abstract program with fresh predicates, =<' and >', which express
the grounding behaviour of the builtins. The $\diamond$ symbol separates an assertion (the
required mode) from another Pos formula describing the grounding behaviour of a
successful call to the builtin (the success mode). For example, the formula $g_6$ left
of $\diamond$ in the =<' clause asserts that the =< test will error if its first two arguments
are not ground, whereas the $g_6$ right of $\diamond$ describes the state that holds if the
test succeeds. These formulae do not coincide for all builtins (see Table 1). For
quicksort, the only non-trivial assertions arise from builtins. This would change if
the programmer introduced assertions for verification (Puebla et al., 2000a).

2.3 Least fixpoint calculation

An iterative algorithm is used to compute the lfp and thereby characterise the
success patterns of the program. A success pattern is a pair consisting of an atom
with distinct variables for arguments paired with a Pos formula over those variables.
Renaming and equality of formulae induce an equivalence between success patterns
which is needed to detect the fixpoint. The patterns $\langle p(u, w, v), u \wedge (w \Leftrightarrow v) \rangle$ and
$\langle p(x_1, x_2, x_3), (x_3 \Leftrightarrow x_2) \wedge x_1 \rangle$, for example, are considered to be identical: both
express the same inter-argument groundness dependencies. Each iteration produces
a set of success patterns: at most one pair for each predicate in the program.



2.3.1 Upper approximation of success patterns

A success pattern records an inter-argument groundness dependency that describes
the binding effects of executing a predicate. If $\langle p(\vec{x}), f \rangle$ correctly describes the
predicate $p$, and $g$ holds whenever $f$ holds, then $\langle p(\vec{x}), g \rangle$ also correctly describes
$p$. Success patterns can thus be approximated from above without compromising
correctness.

Iteration is performed in a bottom-up fashion and commences with $F_0 = \emptyset$. $F_{j+1}$
is computed from $F_j$ by considering each clause $p(\vec{x}) \leftarrow d \diamond f, p_1(\vec{x}_1), \ldots, p_n(\vec{x}_n)$ in
turn. Initially $F_{j+1} = \emptyset$. The success pattern formulae $f_i$ for the $n$ body atoms
are conjoined with $f$ to obtain $g = f \wedge \bigwedge_{i=1}^{n} f_i$. Variables not present in $p(\vec{x})$, $Y$
say, are then eliminated from $g$ by computing $g' = \exists_Y(g)$ (weakening $g$) where
$\exists_{\{y_1 \ldots y_n\}}(g) = \exists_{y_1}(\ldots \exists_{y_n}(g))$. Weakening $g$ does not compromise correctness
because success patterns can be safely approximated from above.



2.3.2 Weakening upper approximations 

If $F_{j+1}$ already contains a pattern of the form $\langle p(\vec{x}), g'' \rangle$, then this pattern is replaced
with $\langle p(\vec{x}), g' \vee g'' \rangle$, otherwise $F_{j+1}$ is revised to include $\langle p(\vec{x}), g' \rangle$. Thus the
success patterns become progressively weaker on each iteration. Again, correctness
is preserved because success patterns can be safely approximated from above.



2.3.3 Least fixpoint calculation for Quicksort 



For brevity, let $u = (x_1, x_2)$, $v = (x_1, x_2, x_3)$ and $w = (x_1, x_2, x_3, x_4)$. Then the lfp
for the abstracted Quicksort program is obtained (and checked) in the following 3
iterations:

$$F_1 = \left\{ \begin{array}{l}
\langle qs(v), x_1 \wedge (x_2 \Leftrightarrow x_3) \rangle \\
\langle pt(w), x_1 \wedge x_3 \wedge x_4 \rangle \\
\langle \mathtt{=<'}(u), x_1 \wedge x_2 \rangle \\
\langle \mathtt{>'}(u), x_1 \wedge x_2 \rangle
\end{array} \right\}
\qquad
F_2 = \left\{ \begin{array}{l}
\langle qs(v), x_2 \Leftrightarrow (x_1 \wedge x_3) \rangle \\
\langle pt(w), x_1 \wedge x_3 \wedge x_4 \rangle \\
\langle \mathtt{=<'}(u), x_1 \wedge x_2 \rangle \\
\langle \mathtt{>'}(u), x_1 \wedge x_2 \rangle
\end{array} \right\}$$

Finally, $F_3 = F_2$. The space of success patterns forms a complete lattice which
ensures that a lfp (a most precise solution) exists. The iterative process will
always terminate since the space is finite and hence the number of times each
success pattern can be updated is also finite. Moreover, it will converge onto the
lfp since iteration commences with the bottom element $F_0 = \emptyset$.

Observe that $F_2$, the lfp, faithfully describes the grounding behaviour of quicksort:
a qs goal will ground its second argument if it is called with its first and third
arguments already ground and vice versa. Note that assertions are not considered
in the lfp calculation.
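
The least fixpoint iteration of section 2.3 can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions, not the analyser reported in the paper: a success pattern is stored as the set of 0/1 argument tuples that satisfy its formula, so that existential projection and disjunction both become set union; the names `CLAUSES`, `step`, `lfp`, `le` and `gt` (the last two standing for =<' and >') are hypothetical.

```python
from itertools import product

def iff(a, b):
    return bool(a) == bool(b)

G6 = lambda e: e["m"] and e["x"]
PT = ("pt", ("xs", "m", "l", "h"))

# (head predicate, head arguments, abstract constraint f, body atoms)
CLAUSES = [
    ("qs", ("t1", "s", "t2"), lambda e: e["t1"] and iff(e["t2"], e["s"]), []),
    ("qs", ("t1", "s", "t"),
     lambda e: iff(e["t1"], e["m"] and e["xs"]) and iff(e["t3"], e["m"] and e["r"]),
     [PT, ("qs", ("l", "s", "t3")), ("qs", ("h", "r", "t"))]),
    ("pt", ("t1", "m", "t2", "t3"), lambda e: e["t1"] and e["t2"] and e["t3"], []),
    ("pt", ("t1", "m", "t2", "h"),
     lambda e: iff(e["t1"], e["x"] and e["xs"]) and iff(e["t2"], e["x"] and e["l"]),
     [("le", ("m", "x")), PT]),
    ("pt", ("t1", "m", "l", "t2"),
     lambda e: iff(e["t1"], e["x"] and e["xs"]) and iff(e["t2"], e["x"] and e["h"]),
     [("gt", ("m", "x")), PT]),
    ("le", ("m", "x"), G6, []),
    ("gt", ("m", "x"), G6, []),
]

def step(F):
    """Compute F_{j+1} from F_j: conjoin, project onto the head, then join."""
    new = {p: set() for p in F}
    for head, hargs, f, body in CLAUSES:
        vs = sorted(set(hargs) | {v for _, args in body for v in args})
        for bits in product((0, 1), repeat=len(vs)):
            env = dict(zip(vs, bits))
            if f(env) and all(tuple(env[a] for a in args) in F[p]
                              for p, args in body):
                new[head].add(tuple(env[a] for a in hargs))
    return new

def lfp():
    F = {p: set() for p in ("qs", "pt", "le", "gt")}   # F_0 is empty
    count = 0
    while True:
        G = step(F)
        count += 1
        if G == F:
            return F, count
        F = G

F, ITERATIONS = lfp()
```

With this encoding the loop stops after three applications of `step`, and `F["qs"]` contains exactly the tuples satisfying $x_2 \Leftrightarrow (x_1 \wedge x_3)$.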

2.4 Greatest fixpoint calculation

A bottom-up strategy is used to compute a gfp and thereby characterise the safe 
call patterns of the program. A safe call pattern describes queries that do not violate 
the assertions. A call pattern has the same form as a success pattern (so there is one 
call pattern per predicate rather than one per clause). One starts with assuming no 
call causes an error and then checks this assumption by reasoning backwards over 
all clauses. If an assertion is violated, the set of safe call patterns for the involved 
predicate is strengthened (made smaller), and the whole process is repeated until 
the assumptions turn out to be valid (the gfp is reached). 

2.4.1 Lower approximation of safe call patterns

Iteration commences with $D_0 = \{\langle p(\vec{x}), 1 \rangle \mid p \in \Pi\}$ where $\Pi$ is the set of predicate
symbols occurring in the program. An iterative algorithm incrementally strengthens
the call pattern formulae until they only describe queries which lead to computations
that satisfy the assertions. Note that call patterns describe a subset (rather
than a superset) of those queries which are safe. Call patterns are thus lower
approximations in contrast to success patterns which are upper approximations. Put
another way, if $\langle p(\vec{x}), g \rangle$ correctly describes some safe call patterns of $p$, and $g$ holds
whenever $f$ holds, then $\langle p(\vec{x}), f \rangle$ also correctly describes some safe call patterns
of $p$. Call patterns can thus be approximated from below without compromising
correctness (but not from above).

$D_{k+1}$ is computed from $D_k$ by considering each $p(\vec{x}) \leftarrow d \diamond f, p_1(\vec{x}_1), \ldots, p_n(\vec{x}_n)$
in turn and calculating a formula that characterises its safe calling modes. Initially
set $D_{k+1} = D_k$. A safe calling mode is calculated by propagating moding requirements
right-to-left by repeated application of the logical operator $\Rightarrow$. More exactly,
let $f_i$ denote the success pattern formula for $p_i(\vec{x}_i)$ in the previously computed lfp
and let $d_i$ denote the call pattern formula for $p_i(\vec{x}_i)$ in $D_k$. Set $e_{n+1} = 1$ and then
compute $e_i = d_i \wedge (f_i \Rightarrow e_{i+1})$ for $1 \le i \le n$. Each $e_i$ describes a safe calling mode
for the compound goal $p_i(\vec{x}_i), \ldots, p_n(\vec{x}_n)$.

2.4.2 Intuition and explanation 

The intuition behind the symbolism is that $d_i$ represents the demand that is already
known for $p_i(\vec{x}_i)$ not to error whereas $e_i$ is $d_i$ possibly strengthened with extra
demand so as to ensure that the sub-goal $p_{i+1}(\vec{x}_{i+1}), \ldots, p_n(\vec{x}_n)$ also does not error
when executed immediately after $p_i(\vec{x}_i)$. Put another way, anything larger than $d_i$
may possibly cause an error when executing $p_i(\vec{x}_i)$ and anything larger than $e_i$ may
possibly cause an error when executing $p_i(\vec{x}_i), \ldots, p_n(\vec{x}_n)$.

The basic inductive step in the analysis is to compute an $e_i$ which ensures that
$p_i(\vec{x}_i), \ldots, p_n(\vec{x}_n)$ does not error, given $d_i$ and $e_{i+1}$ which respectively ensure that
$p_i(\vec{x}_i)$ and $p_{i+1}(\vec{x}_{i+1}), \ldots, p_n(\vec{x}_n)$ do not error. This step translates a demand after
the call to $p_i(\vec{x}_i)$ into a demand before the call to $p_i(\vec{x}_i)$. The tactic is to set $e_{n+1} = 1$
and then compute $e_i = d_i \wedge (f_i \Rightarrow e_{i+1})$ for $i \le n$. This tactic is best explained
by unfolding the definitions of $e_n$, then $e_{n-1}$, then $e_{n-2}$, and so on. This reverse
ordering reflects the order in which the $e_i$ are computed; the $e_i$ are computed whilst
walking backward across the clause. Any calling mode is safe for the empty goal
and hence $e_{n+1} = 1$. Note that $e_n = d_n \wedge (f_n \Rightarrow e_{n+1}) = d_n \wedge (f_n \Rightarrow 1) = d_n$. Hence
$e_n$ represents a safe calling mode for the goal $p_n(\vec{x}_n)$.

Observe that $e_i$ should not be larger than $d_i$, otherwise an error may occur
while executing $p_i(\vec{x}_i)$. Observe too that if $p_i(\vec{x}_i), \ldots, p_n(\vec{x}_n)$ is called with a mode
described by $d_i$, then $p_{i+1}(\vec{x}_{i+1}), \ldots, p_n(\vec{x}_n)$ is called with a mode described by
$(d_i \wedge f_i)$ since $f_i$ describes the success patterns of $p_i(\vec{x}_i)$. The mode $(d_i \wedge f_i)$ may
satisfy the $e_{i+1}$ demand. If it does not, then the minimal extra demand is added
to $(d_i \wedge f_i)$ so as to satisfy $e_{i+1}$. This minimal extra demand is $((d_i \wedge f_i) \Rightarrow e_{i+1})$ -
the weakest mode that, in conjunction with $(d_i \wedge f_i)$, ensures that $e_{i+1}$ holds. Put
another way, $((d_i \wedge f_i) \Rightarrow e_{i+1}) = \bigvee \{ f \in Pos \mid (d_i \wedge f_i) \wedge f \models e_{i+1} \}$.

Combining the requirements to satisfy $p_i(\vec{x}_i)$ and then $p_{i+1}(\vec{x}_{i+1}), \ldots, p_n(\vec{x}_n)$,
gives $e_i = d_i \wedge ((d_i \wedge f_i) \Rightarrow e_{i+1})$ which reduces to $e_i = d_i \wedge (f_i \Rightarrow e_{i+1})$ and
corresponds to the tactic used in the basic inductive step.
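
The right-to-left propagation can be written as a short fold over the body atoms. The sketch below is illustrative and assumes an encoding of formulae as Python callables over a 0/1 environment; the names `demand`, `backward` and `table` are hypothetical. The first example uses the body of the second pt clause of Quicksort with the call patterns of $D_1$; the second is a made-up two-atom body, included only to show a demand being weakened by a success pattern.

```python
from itertools import product

VS = ("m", "x", "xs", "l", "h")

def table(f):
    """Truth table of f over VS, used to compare formulae for equivalence."""
    return tuple(bool(f(dict(zip(VS, bits))))
                 for bits in product((0, 1), repeat=len(VS)))

def demand(d, f, nxt):
    """One backward step: e_i = d_i and (f_i => e_{i+1})."""
    return lambda env: d(env) and (not f(env) or nxt(env))

def backward(body):
    """body is [(d_1, f_1), ..., (d_n, f_n)] in left-to-right order; returns e_1."""
    e = lambda env: True                 # e_{n+1} = 1
    for d, f in reversed(body):          # walk backward across the clause
        e = demand(d, f, e)
    return e

true = lambda env: True
m_and_x = lambda env: env["m"] and env["x"]
xs_l_h = lambda env: env["xs"] and env["l"] and env["h"]

# Body of the second pt clause: =<'(m, x), pt(xs, m, l, h) with D_1 call patterns.
e1_pt = backward([(m_and_x, m_and_x), (true, xs_l_h)])

# Hypothetical body p, q: p has no demand and grounds l; q demands x and l.
grounds_l = lambda env: env["l"]
x_and_l = lambda env: env["x"] and env["l"]
e1_pq = backward([(true, grounds_l), (x_and_l, true)])
```

In the second example the demand on `l` is discharged by the success pattern of the first atom, leaving only the dependency `l => x` as the safe calling mode.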



2.4.3 Pseudo-complement

This step of calculating the weakest mode that, when conjoined with $d_i \wedge f_i$, implies
$e_{i+1}$, is the very heart of the analysis. Setting $e_i = 0$ would trivially achieve
safety, but $e_i$ should be as weak as possible to maximise the class of safe queries
inferred. For Pos, computing the weakest mode reduces to applying the $\Rightarrow$ operator,
but more generally, this step amounts to applying the pseudo-complement operator.
The pseudo-complement operator (if it exists for a given abstract domain) takes,
as input, two abstractions and returns, as output, the weakest abstraction whose
conjunction with the first input abstraction is at least as strong as the second input
abstraction. If the domain does not possess a pseudo-complement, then there is not
always a unique weakest abstraction (whose conjunction with one given abstraction
is at least as strong as another given abstraction).



To see this, consider the domain Def (Armstrong et al., 1998) which does not
possess a pseudo-complement. Def is the sub-class of Pos that is definite (Armstrong
et al., 1998). This means that Def has the special property that each of its
Boolean functions can be expressed as a (possibly empty) conjunction of propositional
Horn clauses. As with Pos, Def is assumed to be augmented with the
bottom element 0. Def can thus represent the grounding dependencies $x \wedge y$, $x$,
$x \Leftrightarrow y$, $y$, $x \Leftarrow y$, $x \Rightarrow y$, and 1 but not $x \vee y$. Suppose that $d_i \wedge f_i = (x \Leftrightarrow y)$
and $e_{i+1} = (x \wedge y)$. Then conjoining $x$ with $d_i \wedge f_i$ would be at least as strong





Andy King and Lunjin Lu 



as $e_{i+1}$ and symmetrically conjoining $y$ with $d_i \wedge f_i$ would be at least as strong
as $e_{i+1}$. However, Def does not contain a Boolean function strictly weaker than
both $x$ and $y$, namely $x \vee y$, whose conjunction with $d_i \wedge f_i$ is at least as strong
as $e_{i+1}$. Thus setting $e_i = x$ or $e_i = y$ would be safe but setting $e_i = (x \vee y)$ is
prohibited because $x \vee y$ falls outside Def. Moreover, setting $e_i = 0$ would lose
an unacceptable degree of precision. A choice would thus have to be made between
setting $e_i = x$ and $e_i = y$ in some arbitrary fashion, so there would be no clear
tactic for maximising precision.
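
The contrast between Pos and Def in this example can be confirmed by brute force over the sixteen Boolean functions on $x$ and $y$. The sketch below is illustrative only. It assumes two encodings that are not spelled out in the text: a function is represented by its set of models, and membership of Def is tested through the characterisation that the models of a definite function are closed under componentwise conjunction. The names `is_pos`, `is_def` and `maximal` are hypothetical.

```python
from itertools import combinations, product

POINTS = list(product((0, 1), repeat=2))     # assignments to (x, y)
ALL = [frozenset(c) for r in range(len(POINTS) + 1)
       for c in combinations(POINTS, r)]     # all 16 functions, as model sets

def is_pos(f):
    """Positive: true when both variables are 1."""
    return (1, 1) in f

def is_def(f):
    """Definite (assumed characterisation): positive, models closed under meet."""
    return is_pos(f) and all((a[0] & b[0], a[1] & b[1]) in f
                             for a in f for b in f)

def maximal(cands):
    """Weakest candidates: those not strictly contained in another candidate."""
    return [f for f in cands if not any(f < g for g in cands)]

a = frozenset({(0, 0), (1, 1)})              # d_i and f_i = (x <-> y)
e = frozenset({(1, 1)})                      # e_{i+1} = (x and y)
x = frozenset({(1, 0), (1, 1)})
y = frozenset({(0, 1), (1, 1)})

safe = [f for f in ALL if (a & f) <= e]      # f with (a and f) |= e
pos_weakest = maximal([f for f in safe if is_pos(f)])
def_weakest = maximal([f for f in safe if is_def(f)])
```

In Pos the search returns the single weakest element $x \vee y$, which coincides with $(x \Leftrightarrow y) \Rightarrow (x \wedge y)$; in Def it returns the two incomparable elements $x$ and $y$.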

Returning to the compound goal $p_i(\vec{x}_i), \ldots, p_n(\vec{x}_n)$, a call described by the mode
$d_i \wedge ((d_i \wedge f_i) \Rightarrow e_{i+1})$ is thus sufficient to ensure that neither $p_i(\vec{x}_i)$ nor the
sub-goal $p_{i+1}(\vec{x}_{i+1}), \ldots, p_n(\vec{x}_n)$ error. Since
$d_i \wedge ((d_i \wedge f_i) \Rightarrow e_{i+1}) = d_i \wedge (f_i \Rightarrow e_{i+1}) = e_i$
it follows that $p_i(\vec{x}_i), \ldots, p_n(\vec{x}_n)$ will not error if its call is described by $e_i$. In
particular, it follows that $e_1$ describes a safe calling mode for the body atoms of
the clause $p(\vec{x}) \leftarrow d \diamond f, p_1(\vec{x}_1), \ldots, p_n(\vec{x}_n)$.

The next step is to calculate $g = d \wedge (f \Rightarrow e_1)$. The abstraction $f$ describes
the grounding behaviour of the Herbrand constraint added to the store prior to
executing the body atoms. Thus $(f \Rightarrow e_1)$ describes the weakest mode that, in
conjunction with $f$, ensures that $e_1$ holds, and hence the body atoms are called
safely. Hence $d \wedge (f \Rightarrow e_1)$ represents the weakest demand that both satisfies the
body atoms and the assertion $d$. One subtlety which relates to the abstraction
process, is that $d$ is required to be a lower-approximation of the assertion whereas
$f$ is required to be an upper-approximation of the constraint. Put another way,
if the mode $d$ describes the binding on the store, then the (concrete) assertion is
satisfied, whereas if the (concrete) constraint is added to the store, then the store
is described by the mode $f$. Table 1 details how to abstract various builtins for
groundness for a declarative subset of ISO Prolog.

2.4.4 Strengthening lower approximations

Variables not present in $p(\vec{x})$, $Y$ say, are then eliminated by $g' = \forall_Y(g)$ (strengthening
$g$) where $\forall_{\{y_1 \ldots y_n\}}(g) = \forall_{y_1}(\ldots \forall_{y_n}(g))$. A safe calling mode for this particular
clause is then given by $g'$. Eliminating variables from $g$ by strengthening $g$ is
unusual and initially appears strange. Recall, however, that call patterns can be
approximated from below without compromising correctness (but not from above).
In particular the standard projection tactic of computing $\exists_{\{y_1 \ldots y_n\}}(g)$ would result
in an upper approximation of $g$ that possibly describes a larger set of concrete call
patterns which would be incorrect. The direction of approximation thus dictates
that eliminating the variables $Y$ from $g$ must strengthen $g$. Indeed, $g$ holds whenever
$\forall_{y_i}(g)$ holds and therefore $g$ holds whenever $\forall_{\{y_1 \ldots y_n\}}(g)$ holds as required.

$D_{k+1}$ will contain a call pattern $\langle p(\vec{x}), g'' \rangle$ and, assuming $g' \wedge g'' \neq g''$, this is
updated with $\langle p(\vec{x}), g' \wedge g'' \rangle$. Thus the call patterns become progressively stronger
on each iteration. Correctness is preserved because call patterns can be safely
approximated from below. The space of call patterns forms a complete lattice which
ensures that a gfp exists. In fact, because call patterns are approximated from below,
the gfp is the most precise solution, and therefore the desired solution. (This
contrasts to the norm in logic program analysis where approximation is from above
and the lfp is the most precise solution). Moreover, since the space of call patterns
is finite, termination is assured. In fact, the scheme will converge onto the gfp since
iteration commences with the top element $D_0 = \{\langle p(\vec{x}), 1 \rangle \mid p \in \Pi\}$.



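2.4.5 Greatest fixpoint calculation for Quicksort

Under this procedure Quicksort generates the following $D_k$ sequence:

$$D_0 = \left\{ \begin{array}{l}
\langle qs(v), 1 \rangle \\
\langle pt(w), 1 \rangle \\
\langle \mathtt{=<'}(u), 1 \rangle \\
\langle \mathtt{>'}(u), 1 \rangle
\end{array} \right\}
\qquad
D_1 = \left\{ \begin{array}{l}
\langle qs(v), 1 \rangle \\
\langle pt(w), 1 \rangle \\
\langle \mathtt{=<'}(u), x_1 \wedge x_2 \rangle \\
\langle \mathtt{>'}(u), x_1 \wedge x_2 \rangle
\end{array} \right\}$$

$$D_2 = \left\{ \begin{array}{l}
\langle qs(v), 1 \rangle \\
\langle pt(w), x_2 \wedge (x_1 \vee (x_3 \wedge x_4)) \rangle \\
\langle \mathtt{=<'}(u), x_1 \wedge x_2 \rangle \\
\langle \mathtt{>'}(u), x_1 \wedge x_2 \rangle
\end{array} \right\}
\qquad
D_3 = \left\{ \begin{array}{l}
\langle qs(v), x_1 \rangle \\
\langle pt(w), x_2 \wedge (x_1 \vee (x_3 \wedge x_4)) \rangle \\
\langle \mathtt{=<'}(u), x_1 \wedge x_2 \rangle \\
\langle \mathtt{>'}(u), x_1 \wedge x_2 \rangle
\end{array} \right\}$$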

These calculations are non-trivial so consider how $D_2$ is obtained from $D_1$ by applying
the clause pt(t1, m, t2, h) :- 1 ⋄ g4, =<'(m, x), pt(xs, m, l, h). The following
$e_i$ and $g$ formulae are generated:

$e_3 = 1$
$e_2 = 1 \wedge ((xs \wedge l \wedge h) \Rightarrow 1) = 1$
$e_1 = (m \wedge x) \wedge ((m \wedge x) \Rightarrow 1) = m \wedge x$
$g = 1 \wedge (((t_1 \Leftrightarrow x \wedge xs) \wedge (t_2 \Leftrightarrow x \wedge l)) \Rightarrow (m \wedge x))$

To characterise those pt(t1, m, t2, h) calls which are safe, it is necessary to compute
a function $g'$ on the variables $t_1$, $m$, $t_2$, $h$ which, if satisfied by the mode of a call,
ensures that $g$ is satisfied by the mode of the call. Put another way, it is necessary to
eliminate the variables $x$, $xs$ and $l$ from $g$ (those variables which do not occur in the
head pt(t1, m, t2, h)) to strengthen $g$ and obtain a function $g'$ such that $g$ holds whenever
$g'$ holds. This is accomplished by calculating $g' = \forall_l \forall_{xs} \forall_x(g)$. First consider the
computation of $\forall_x(g)$:



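  $g[x \mapsto 0] = (((t_1 \Leftrightarrow x \wedge xs) \wedge (t_2 \Leftrightarrow x \wedge l)) \Rightarrow (m \wedge x))[x \mapsto 0]$ 
  $\phantom{g[x \mapsto 0]} = ((t_1 \Leftrightarrow 0 \wedge xs) \wedge (t_2 \Leftrightarrow 0 \wedge l)) \Rightarrow (m \wedge 0)$ 
  $\phantom{g[x \mapsto 0]} = (\neg t_1 \wedge \neg t_2) \Rightarrow 0$ 
  $\phantom{g[x \mapsto 0]} = t_1 \vee t_2$ 

  $g[x \mapsto 1] = (((t_1 \Leftrightarrow x \wedge xs) \wedge (t_2 \Leftrightarrow x \wedge l)) \Rightarrow (m \wedge x))[x \mapsto 1]$ 
  $\phantom{g[x \mapsto 1]} = ((t_1 \Leftrightarrow xs) \wedge (t_2 \Leftrightarrow l)) \Rightarrow m$ 

Since $g[x \mapsto 0] \wedge g[x \mapsto 1] \in Pos$ it follows that: 

  $\forall_x(g) = (((t_1 \Leftrightarrow xs) \wedge (t_2 \Leftrightarrow l)) \Rightarrow m) \wedge (t_1 \vee t_2)$ 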






(otherwise $\forall_x(g)$ would be set to 0). Eliminating the other variables in a similar 
way we obtain: 

  $\forall_{xs} \forall_x (g) = ((t_2 \Leftrightarrow l) \Rightarrow m) \wedge (t_1 \vee t_2)$ 
  $g' = \forall_l \forall_{xs} \forall_x (g) = m \wedge (t_1 \vee t_2)$ 

Observe that if $\forall_l \forall_{xs} \forall_x (g)$ holds then g holds. Thus if the mode of a call satisfies g' 
then the mode also satisfies g as required. This clause thus yields the call pattern 
$\langle pt(\vec{x}), x_2 \wedge (x_1 \vee x_3)\rangle$. Similarly the first and third clauses contribute the patterns 
$\langle pt(\vec{x}), 1\rangle$ and $\langle pt(\vec{x}), x_2 \wedge (x_1 \vee x_4)\rangle$. Observe also that 

  $1 \wedge (x_2 \wedge (x_1 \vee x_3)) \wedge (x_2 \wedge (x_1 \vee x_4)) = x_2 \wedge (x_1 \vee (x_3 \wedge x_4))$ 

which gives the final call pattern formula for $pt(\vec{x})$ in $D_2$. The gfp is reached at 
$D_3$ since $D_4 = D_3$. The gfp often expresses elaborate calling modes; for example, 
it states that $pt(\vec{x})$ cannot generate an instantiation error (nor can any predicate that 
it calls) if it is called with its second, third and fourth arguments ground. This is a 
surprising result which suggests that the analysis can infer information that might 
normally be missed by a programmer. 
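
The elimination of x, xs and l from g can be checked mechanically. The following is only a minimal sketch, assuming Boolean functions are represented as Python predicates over an environment dictionary and compared by brute-force truth tables; the helper names (models, in_pos, forall) are illustrative and do not come from the analyser described in this paper.

```python
from itertools import product

# Variables of the second pt clause: head variables first, then body-only ones.
VARS = ["t1", "m", "t2", "h", "x", "xs", "l"]

def models(f):
    """Truth table of f: the set of satisfying assignments over VARS."""
    return frozenset(bits for bits in product([0, 1], repeat=len(VARS))
                     if f(dict(zip(VARS, bits))))

def in_pos(f):
    """f is in Pos iff it is 0 or it holds when every variable is true."""
    return not models(f) or bool(f(dict.fromkeys(VARS, 1)))

def forall(v, f):
    """Projection from below: f[v -> 0] /\\ f[v -> 1], or 0 if that leaves Pos."""
    def conj(env):
        return f({**env, v: 0}) and f({**env, v: 1})
    return conj if in_pos(conj) else (lambda env: 0)

def g(e):
    """g = 1 /\\ (((t1 <-> x /\\ xs) /\\ (t2 <-> x /\\ l)) => (m /\\ x))."""
    c = (e["t1"] == (e["x"] and e["xs"])) and (e["t2"] == (e["x"] and e["l"]))
    return (not c) or (e["m"] and e["x"])

# Eliminate x, then xs, then l, as in the text.
g_prime = forall("l", forall("xs", forall("x", g)))

def expected(e):
    """The formula derived by hand: m /\\ (t1 \\/ t2)."""
    return e["m"] and (e["t1"] or e["t2"])
```

Comparing models(g_prime) with models(expected) confirms the hand calculation, and checking that no assignment satisfies g_prime while falsifying g confirms that g' strengthens g.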



2.4.6 Restrictions posed by the framework 

The chief computational requirement of the analysis is that the input domain is 
equipped with a pseudo-complement operation. As already mentioned, it is always 
possible to systematically design a domain with this operator (Giacobazzi & Scozzari, 1998) 
and any domain that is known to be condensing (see section 1.2) comes 
equipped with this operator. Currently, however, there are only a few domains with 
a pseudo-complement. Indeed, the domain described in (Codish & Lagoon, 2000) 
appears to be unique in that it is the only type domain that is condensing. This is 
the main limitation of the backward analysis described in this paper. 

Pos is downward-closed in the sense that if a function f describes a substitution, 
then f also describes all substitutions less general than that substitution. The type 
domain of (Codish & Lagoon, 2000) is also downward-closed. It does not follow, 
however, that a domain equipped with a pseudo-complement operation is necessarily 
downward-closed. Heyting completion, the domain refinement technique used 
to construct pseudo-complement, can be moved to linear implication (Giacobazzi 
et al., 1998), though the machinery is more complicated. However, it is likely that 
in the short term tractable condensing domains will continue to be downward-closed. 
In fact, constructing tractable downward-closed condensing domains is a 
topic in itself. 



3 Preliminaries 
3.1 Basic Concepts 



Sets and sequences Let $\mathbb{N}$ denote the set of non-negative integers. The powerset 
of S is denoted $\wp(S)$. The empty sequence is denoted $\epsilon$ and $S^*$ denotes the set of 
(possibly empty) sequences whose elements are drawn from S. Sequence concatenation 
is denoted $\cdot$ and the length of a sequence s is $|s|$. Furthermore, let $s^0 = \epsilon$ and 
$s^n = s \cdot s^{n-1}$ where $n \in \mathbb{N}$. If $n \in \mathbb{N}$ and $s \in \mathbb{N}^*$ then $\max(n \cdot s) = \max(n, \max(s))$ 
where $\max(\epsilon) = 0$. 

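Orderings A pre-order on a set S is a binary relation $\sqsubseteq$ that is reflexive and transitive. 
A partial order on a set S is a pre-order that is anti-symmetric. A poset $\langle S, \sqsubseteq\rangle$ 
is a partial order on a set S. If $\langle S, \sqsubseteq\rangle$ is a poset, then $C \subseteq S$ is a chain iff $a \sqsubseteq b$ or 
$b \sqsubseteq a$ for all $a, b \in C$. A meet semi-lattice $\langle L, \sqsubseteq, \sqcap\rangle$ is a poset $\langle L, \sqsubseteq\rangle$ such that the 
meet (greatest lower bound) $\sqcap\{x, y\}$ exists for all $x, y \in L$. A complete lattice is 
a poset $\langle L, \sqsubseteq\rangle$ such that the meet $\sqcap X$ and the join $\sqcup X$ (least upper bound) exist 
for all $X \subseteq L$. Top and bottom are respectively defined by $\top = \sqcap\emptyset$ and $\bot = \sqcup\emptyset$. A 
complete lattice is denoted $\langle L, \sqsubseteq, \sqcap, \sqcup, \top, \bot\rangle$. Let $\langle S, \sqsubseteq\rangle$ be a pre-order. If $X \subseteq S$ 
then ${\downarrow}(X) = \{y \in S \mid \exists x \in X . y \sqsubseteq x\}$. If $x \in S$ then ${\downarrow}(x) = {\downarrow}(\{x\})$. The set of order-ideals 
of S, denoted $\wp^{\downarrow}(S)$, is defined by $\wp^{\downarrow}(S) = \{X \subseteq S \mid X = {\downarrow}(X)\}$. Observe 
that $\langle \wp^{\downarrow}(S), \subseteq, \cup, \cap, S, \emptyset\rangle$ is a complete lattice. 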

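An algebraic structure is a pair $\langle S, \Omega\rangle$ where S is a non-empty set and $\Omega$ is a 
collection of n-ary operations $f : S^n \to S$ where $n \in \mathbb{N}$. Let $\langle S, \sqsubseteq\rangle$ and $\langle S', \sqsubseteq'\rangle$ be 
posets and $\langle S, \Omega\rangle$ and $\langle S', \Omega'\rangle$ algebraic structures such that $\Omega = \{f_i \mid i \in I\}$ and 
$\Omega' = \{f'_i \mid i \in I\}$ for an index set I. Then $\alpha : S \to S'$ is a semi-morphism between 
$\langle S, \Omega\rangle$ and $\langle S', \Omega'\rangle$ iff $\alpha(f_i(s_1, \ldots, s_n)) \sqsubseteq' f'_i(\alpha(s_1), \ldots, \alpha(s_n))$ for all $\langle s_1, \ldots, s_n\rangle \in S^n$ 
and $i \in I$. 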

Functions and fixpoints Let $f : A \to B$. Then dom(f) denotes the domain of f 
and if $C \subseteq A$ then $f(C) = \{f(c) \mid c \in C\}$. Furthermore, cod(f) = f(dom(f)). 
Let $\langle L, \sqsubseteq, \sqcup, \sqcap\rangle$ and $\langle L', \sqsubseteq', \sqcup', \sqcap'\rangle$ be complete lattices. The map $f : L \to L'$ is 
additive iff $f(\sqcup X) = \sqcup' f(X)$ for all $X \subseteq L$; f is continuous iff $f(\sqcup C) = \sqcup' f(C)$ 
for all chains $C \subseteq L$; f is co-continuous iff $f(\sqcap C) = \sqcap' f(C)$ for all chains $C \subseteq L$ 
and f is monotonic iff $f(x) \sqsubseteq' f(y)$ for all $x \sqsubseteq y$. Let $x \sqsubseteq y$. If f is continuous 
then $f(y) = f(x \sqcup y) = \sqcup'\{f(x), f(y)\}$ and thus $f(x) \sqsubseteq' f(y)$. If f is co-continuous 
then $f(x) = f(x \sqcap y) = \sqcap'\{f(x), f(y)\}$ and thus $f(x) \sqsubseteq' f(y)$. Both continuity 
and co-continuity thus imply monotonicity. If $f : L \to L$, then f is idempotent 
iff $f(x) = f^2(x)$ for all $x \in L$ and f is extensive iff $x \sqsubseteq f(x)$ for all $x \in L$. 
The Knaster-Tarski theorem states that any monotone operator $f : L \to L$ on a 
complete lattice $\langle L, \sqsubseteq, \sqcup, \sqcap, \top, \bot\rangle$ admits both greatest and least fixpoints that are 
characterised by $gfp(f) = \sqcup\{x \in L \mid x \sqsubseteq f(x)\}$ and $lfp(f) = \sqcap\{x \in L \mid f(x) \sqsubseteq x\}$. 
If f is co-continuous then $gfp(f) = \sqcap_{n \in \mathbb{N}} f^n(\top)$ and dually if f is continuous then 
$lfp(f) = \sqcup_{n \in \mathbb{N}} f^n(\bot)$. $\{f^n(\top) \mid n \in \mathbb{N}\}$ and $\{f^n(\bot) \mid n \in \mathbb{N}\}$ are, respectively, the 
lower and upper Kleene iteration sequences of f. 
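
On a finite lattice the lower Kleene sequence stabilises after finitely many steps, so the gfp can be computed by plain iteration from the top element. The sketch below illustrates this on a powerset lattice; the successor graph, the error set and the names gfp, step, SUCC and ERROR are invented for illustration and are not part of the paper.

```python
def gfp(f, top):
    """Lower Kleene iteration: top, f(top), f(f(top)), ... until it stabilises."""
    x = top
    while True:
        y = f(x)
        if y == x:
            return x
        x = y

# Hypothetical transition graph: state -> set of successor states.
SUCC = {0: {1}, 1: {2}, 2: {0}, 3: {4}, 4: set()}
ERROR = {4}  # hypothetical error states

def step(safe):
    """Keep the states that are not errors and whose successors are all still safe."""
    return frozenset(s for s in safe if s not in ERROR and SUCC[s] <= safe)

# Iteration starts at the top element (all states) and shrinks towards the gfp.
safe_states = gfp(step, frozenset(SUCC))
```

Here step is monotonic on the powerset of states, so the iterates form a decreasing chain; state 3 survives the first step but is discarded in the second, once its only successor has been removed.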

Galois insertions and closure operators If $\langle S, \sqsubseteq\rangle$ and $\langle S', \sqsubseteq'\rangle$ are posets and 
$\alpha : S \to S'$ and $\gamma : S' \to S$ are monotonic maps such that $\forall x \in S . x \sqsubseteq \gamma(\alpha(x))$ 
and $\forall x' \in S' . \alpha(\gamma(x')) \sqsubseteq' x'$, then the quadruple $\langle S, \gamma, S', \alpha\rangle$ is a Galois connection 
between S and S'. In other words, $\alpha$ is the lower (or left) adjoint of $\gamma$ and $\gamma$ is the upper 
(or right) adjoint of $\alpha$. If, in addition, $\forall x' \in S' . x' \sqsubseteq' \alpha(\gamma(x'))$, then $\langle S, \gamma, S', \alpha\rangle$ 
is a Galois insertion between S and S'. The operator $\rho : L \to L$ on a complete lattice 
$\langle L, \sqsubseteq\rangle$ is a closure operator iff $\rho$ is monotonic, idempotent and extensive. The 
set of closure operators on L is denoted uco(L). The image set $\rho(L)$ of a closure 
operator $\rho$ is a complete lattice with respect to $\sqsubseteq$. A Galois insertion $\langle L, \gamma, L', \alpha\rangle$ 
between the complete lattices L and L' defines the closure operator $\rho = \gamma \circ \alpha$. 
Conversely, a closure operator $\rho : L \to L$ on the complete lattice $\langle L, \sqsubseteq, \sqcup\rangle$ defines 
the Galois insertion $\langle L, id, \rho(L), \rho\rangle$ where id denotes identity. Galois insertions and 
closure operators are thus isomorphic, though closure operators are typically more 
succinct and hence used in this paper. 

Substitutions Let Sub denote the set of (idempotent) substitutions and let Ren 
denote the set of (bijective) renaming substitutions. 



3.2 Cylindric constraint systems 

Let V denote a (denumerable) universe of variables and let $\mathcal{C}$ denote a constraint 
system over V. An algebra $\langle \mathcal{C}, \leq, \otimes, 1, \{\exists_x\}_{x \in V}, \{d_{x,y}\}_{x,y \in V}\rangle$ is a semi-cylindric 
constraint system iff $\langle \mathcal{C}, \leq, \otimes\rangle$ is a meet semi-lattice with a top 1; $\exists_x$ is a family of 
(unary) cylindrification operations such that: $c \leq \exists_x(c)$, $\exists_x(c) \leq \exists_x(c')$ if $c \leq c'$, 
$\exists_x(c \otimes \exists_x(c')) = \exists_x(c) \otimes \exists_x(c')$; and $d_{x,y}$ is a family of (constant) diagonalisation 
operations such that: $d_{x,x} = 1$, $d_{x,y} = \exists_z(d_{x,z} \otimes d_{z,y})$ and $d_{x,y} \otimes \exists_x(c \otimes d_{x,y}) \leq c$ 
if $x \neq y$. Cylindrification captures the concept of projecting out a variable (and is 
useful in modeling variables that go out of scope) whereas diagonalisation captures 
the notion of an alias between two variables (and is useful in modeling parameter 
passing). (The reader is referred to (Giacobazzi et al., 1995) for further details on 
cylindric constraint systems and their application in abstract interpretation.) 
Example 3.1 

An equation e is a pair (s = t) where s and t are terms. A finite conjunction of 
equations is denoted E and Eqn denotes the set of finite conjunctions of equations. 
Let $eqn(\theta) = \{x = t \mid x \mapsto t \in \theta\}$ and $unify(E) = \{\theta \in Sub \mid \forall (s = t) \in E . \theta(s) = \theta(t)\}$. 
Eqn is pre-ordered by entailment $E_1 \leq E_2$ iff $unify(E_1) \subseteq unify(E_2)$ 
and quotiented by $E_1 \approx E_2$ iff $E_1 \leq E_2$ and $E_2 \leq E_1$. This gives 
the meet semi-lattice $\langle Eqn/{\approx}, \leq, \otimes\rangle$ with a top 1 where conjunction is defined 
$[E_1]_\approx \otimes [E_2]_\approx = [E_1 \cup E_2]_\approx$ and $1 = [\emptyset]_\approx$. Let $mgu(E) = \{\theta \in unify(E) \mid \forall \kappa \in unify(E) . eqn(\kappa) \leq eqn(\theta)\}$. 
Finally, let $d_{x,y} = [\{x = y\}]_\approx$ and define project out by 
$\exists_x([E]_\approx) = [eqn(\{y \mapsto t \in \theta \mid x \neq y\})]_\approx$ if $\theta \in mgu(E)$. Otherwise, if $mgu(E) = \emptyset$, 
define $\exists_x([E]_\approx) = [\{a = b\}]_\approx$ where a and b are distinct constant symbols. Then 
$\langle Eqn/{\approx}, \leq, \otimes, 1, \{\exists_x\}_{x \in V}, \{d_{x,y}\}_{x,y \in V}\rangle$ is a semi-cylindric constraint system. 

An algebra $\langle \mathcal{C}, \leq, \oplus, \otimes, 1, 0, \{\exists_x\}_{x \in V}, \{d_{x,y}\}_{x,y \in V}\rangle$ that extends a semi-cylindric 
constraint system to a complete lattice $\langle \mathcal{C}, \leq, \oplus, \otimes, 1, 0\rangle$ is a cylindric constraint system. 
A semi-cylindric constraint system can be lifted to a cylindric constraint system 
via a power-domain construction. In particular $\langle \wp^{\downarrow}(\mathcal{C}), \subseteq, \cup, \cap, \mathcal{C}, \emptyset, \{\exists'_x\}_{x \in V}, \{d'_{x,y}\}_{x,y \in V}\rangle$ 
is a cylindric constraint system where $\exists'_x(C) = {\downarrow}(\{\exists_x(c) \mid c \in C\})$ and 
$d'_{x,y} = {\downarrow}(d_{x,y})$. 

Example 3.2 

The semi-cylindric system of example 3.1 can be lifted to the cylindric system 
$\langle \wp^{\downarrow}(Eqn), \subseteq, \cup, \cap, Eqn, \emptyset, \exists', d'\rangle$ where $\exists'_x(C) = {\downarrow}(\{\exists_x(c) \mid c \in C\})$ and $d'_{x,y} = {\downarrow}(d_{x,y})$. 

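In the sequel, unless otherwise stated, all constraint systems considered are 
over the same V and thus a cylindric constraint system will be simply denoted 
$\langle \mathcal{C}, \leq, \oplus, \otimes, 1, 0, \exists, d\rangle$. Let var(o) denote the set of the variables in the syntactic 
object o and let FV(c) denote the set of free variables in a constraint $c \in \mathcal{C}$, 
that is, $FV(c) = \{x \in var(c) \mid \exists y \in V . c \neq \exists_x(c \otimes d_{x,y})\}$. Abbreviate project 
out by $\exists_{\{x_1, \ldots, x_n\}}(c) = \exists_{x_1}(\ldots(\exists_{x_n}(c)))$ and project onto by $\overline{\exists}_X(c) = \exists_{FV(c) \setminus X}(c)$. 
Let $d_{\vec{x},\vec{y}} = \otimes_{i=1}^{n} d_{x_i,y_i}$ where $\vec{x} = \langle x_1 \ldots x_n\rangle$ and $\vec{y} = \langle y_1 \ldots y_n\rangle$. If $c \in \mathcal{C}$ then 
let $\partial_{\vec{x}}^{\vec{y}}(c)$ denote the constraint obtained by replacing $\vec{x}$ with $\vec{y}$, that is, $\partial_{\vec{x}}^{\vec{y}}(c) = 
\exists_{\vec{z}}(\exists_{\vec{x}}(c \otimes d_{\vec{x},\vec{z}}) \otimes d_{\vec{z},\vec{y}})$ where $var(\vec{z}) \cap (FV(c) \cup var(\vec{x}) \cup var(\vec{y})) = \emptyset$. Finally, if 
$C \subseteq \mathcal{C}$ then $\partial_{\vec{x}}^{\vec{y}}(C) = \{\partial_{\vec{x}}^{\vec{y}}(c) \mid c \in C\}$. 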

Example 3.3 

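Let X be a finite subset of V. The groundness domain $\langle EPos_X, \models, \curlyvee, \wedge, 1, 0\rangle$ 
(Heaton et al., 2000) is a finite lattice where $EPos_X = \{0\} \cup \{\wedge F \mid F \subseteq X \cup E_X\}$, 
$E_X = \{x \Leftrightarrow y \mid x, y \in X\}$ and $f_1 \curlyvee f_2 = \wedge\{f \in EPos_X \mid f_1 \models f \wedge f_2 \models f\}$. $EPos_X$ 
is a cylindric constraint system with $d_{x,y} = (x \Leftrightarrow y)$ and $\exists_x(f) = f' \wedge f''$ where 
$f' = \wedge\{y \in Y \mid f \models y\}$, $f'' = \wedge\{e \in E_Y \mid f \models e\}$ and $Y = X \setminus \{x\}$. 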

Example 3.4 

Let $Bool_X$ denote the Boolean functions over X. The dependency domain $Pos_X$ 
(Armstrong et al., 1998) is defined by $Pos_X = \{0\} \cup \{f \in Bool_X \mid \wedge X \models f\}$. 
Henceforth Y abbreviates $\wedge Y$. The lattice $\langle Pos_X, \models, \vee, \wedge, 1, 0\rangle$ is finite and is a 
cylindric constraint system with $d_{x,y} = (x \Leftrightarrow y)$ and Schröder elimination defining 
$\exists_x(f) = f[x \mapsto 1] \vee f[x \mapsto 0]$. 
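
Membership of Pos and Schröder elimination are both easy to prototype. The following is a rough sketch, assuming a fixed three-variable set and a truth-table comparison of Boolean functions; the names X, table, in_pos and exists are chosen here for illustration only.

```python
from itertools import product

X = ("x", "y", "z")  # a small, illustrative variable set

def table(f):
    """The set of assignments over X (as 0/1 tuples) that satisfy f."""
    return frozenset(b for b in product((0, 1), repeat=len(X))
                     if f(dict(zip(X, b))))

def in_pos(f):
    """f is in Pos_X iff f = 0 or the all-true assignment satisfies f."""
    t = table(f)
    return not t or (1,) * len(X) in t

def exists(v, f):
    """Schroeder elimination: f[v -> 1] \\/ f[v -> 0]."""
    return lambda env: f({**env, v: 1}) or f({**env, v: 0})

def f(e):
    """x <-> (y /\\ z), e.g. the groundness abstraction of x = g(y, z)."""
    return e["x"] == (e["y"] and e["z"])

f_without_y = exists("y", f)  # equivalent to x => z
```

Projecting y out of x <-> (y /\ z) leaves x => z: once y is forgotten, all that remains known is that grounding x grounds z.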



3.3 Complete Heyting algebras 

Let $\langle L, \sqsubseteq, \sqcap\rangle$ be a lattice with $x, y \in L$. The pseudo-complement of x relative to 
y, if it exists, is the unique element $z \in L$ such that $x \sqcap w \sqsubseteq y$ iff $w \sqsubseteq z$. L is relatively 
pseudo-complemented iff the pseudo-complement of x relative to y, denoted $x \rightarrow y$, 
exists for all $x, y \in L$. If L is also complete then it is a complete Heyting algebra 
(cHa). If $x, y \in L$ then $x \sqcap (x \rightarrow y) = x \sqcap y$. Furthermore, if $\langle L, \sqsubseteq, \sqcup, \sqcap\rangle$ is a cHa 
then $x \rightarrow y = \sqcup\{w \in L \mid x \sqcap w \sqsubseteq y\}$. The intuition behind the pseudo-complement 
of x relative to y is that it is the weakest element whose combination (meet) with 
x implies y. Interestingly, pseudo-complement can be interpreted as the adjoint 
of conjunction. (The reader is referred to (van Dalen, 1997) for further details 
on complete Heyting algebras.) The following result (Birkhoff, 1967)[Chapter IX, 
Theorem 15] explains how a cHa depends on the additivity of meet. 

Theorem 3.1 

A complete lattice L is relatively pseudo-complemented iff $x \sqcap (\sqcup Y) = \sqcup\{x \sqcap y \mid y \in Y\}$ 
for all $x \in L$ and $Y \subseteq L$. 

Example 3.5 

Let $\{x, y\} \subseteq X$ and $f = (x \Leftrightarrow y)$. Then returning to $EPos_X$ of example 3.3, 
$f \wedge (\curlyvee\{x, y\}) = f \wedge 1 = f \neq (x \wedge y) = \curlyvee\{x \wedge y, x \wedge y\} = \curlyvee\{f \wedge x, f \wedge y\}$. Hence, 
by theorem 3.1, $EPos_X$ is not a cHa. Now consider $Pos_X$ of example 3.4, and 
specifically let $f \in Pos_X$ and $G \subseteq Pos_X$. Since $\wedge$ distributes over $\vee$, it follows that 
$f \wedge (\vee G) = \vee\{f \wedge g \mid g \in G\}$, thus by theorem 3.1 $Pos_X$ is a cHa. Similarly, $\cap$ 
distributes over $\cup$, and thus it follows by theorem 3.1 that $\wp^{\downarrow}(\mathcal{C})$ is also a cHa. 



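3.4 Constraint logic programs 

Let $\Pi$ denote a (finite) set of predicate symbols, let Atom denote the set of (flat) 
atoms over $\Pi$ with distinct arguments drawn from V, and let $\langle \mathcal{C}, \leq, \oplus, \otimes, 1, 0, \exists, d\rangle$ 
be a semi-cylindric constraint system. The set of constrained atoms is defined by 
$Base^{\mathcal{C}} = \{p(\vec{x}) \text{ :- } c \mid p(\vec{x}) \in Atom \wedge c \in \mathcal{C}\}$. Let $FV(p(\vec{x}) \text{ :- } c) = var(\vec{x}) \cup FV(c)$. 
Entailment $\leq$ lifts to $Base^{\mathcal{C}}$ by $w_1 \leq w_2$ iff $\overline{\exists}_{\vec{x}}(d_{\vec{x},\vec{x}_1} \otimes c_1) \leq \overline{\exists}_{\vec{x}}(d_{\vec{x},\vec{x}_2} \otimes c_2)$ where 
$w_i = p(\vec{x}_i) \text{ :- } c_i$ and $var(\vec{x}) \cap (FV(w_1) \cup FV(w_2)) = \emptyset$. This pre-order defines the 
equivalence relation $w_1 \approx w_2$ iff $w_1 \leq w_2$ and $w_2 \leq w_1$ to give a set of interpretations 
defined by $Int^{\mathcal{C}} = \wp(Base^{\mathcal{C}}/{\approx})$. $Int^{\mathcal{C}}$ is ordered by $I_1 \sqsubseteq I_2$ iff for all $[w_1]_\approx \in I_1$ 
there exists $[w_2]_\approx \in I_2$ such that $w_1 \leq w_2$. Let $\equiv$ denote the induced equivalence 
relation $I_1 \equiv I_2$ iff $I_1 \sqsubseteq I_2$ and $I_2 \sqsubseteq I_1$. $\langle Int^{\mathcal{C}}/{\equiv}, \sqcup, \sqcap, \top, \bot\rangle$ is a complete 
lattice where $[I_1]_\equiv \sqcup [I_2]_\equiv = [I_1 \cup I_2]_\equiv$, $[I_1]_\equiv \sqcap [I_2]_\equiv = [\cup\{I \mid I \sqsubseteq I_1 \wedge I \sqsubseteq I_2\}]_\equiv$, 
$\top = [\{[p(\vec{x}) \text{ :- } 1]_\approx \mid p(\vec{x}) \in Atom\}]_\equiv$ and $\bot = [\emptyset]_\equiv$. 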

A constraint logic program P over $\mathcal{C}$ is a finite set of clauses w of the form 
w = h :- c, g where $h \in Atom$, $c \in \mathcal{C}$, $g \in Goal$ and $Goal = Atom^*$. The fixpoint 
semantics of P is defined in terms of an immediate consequences operator $\mathcal{F}_P^{\mathcal{C}}$. 

Definition 3.1 

Given a constraint logic program P over a semi-cylindric constraint system $\mathcal{C}$, the 
operator $\mathcal{F}_P^{\mathcal{C}} : Int^{\mathcal{C}} \to Int^{\mathcal{C}}$ is defined by: 



Tp(I)={\p(x):-c'] f 



3 p(x) :-C,£>i(xi), . . . ,Pn(x n ) G P 

3 {[Pi(^):-CiW? =1 CJ 



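The operator $\mathcal{F}_P^{\mathcal{C}}$ lifts to $Int^{\mathcal{C}}/{\equiv}$ by $\mathcal{F}_P^{\mathcal{C}}([I]_\equiv) = [\mathcal{F}_P^{\mathcal{C}}(I)]_\equiv$. The lifting is monotonic 
and hence the fixpoint semantics for a program P over $\mathcal{C}$ exists and is denoted 
$\mathcal{F}^{\mathcal{C}}(P) = lfp(\mathcal{F}_P^{\mathcal{C}})$. (The reader is referred to (Bossi et al., 1994; Jaffar & Maher, 
1994) for further details on semantics and constraint logic programming.) 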

The operational semantics of P is defined in terms of a transition system — fp 
between states of the form State = Goal x C. To define the transition system, 
let FV((g;c)) = var( 3 ) U FV(c) and FV(h;-c,g) = var(/i) U FV(c) U var(y). To 
rename clauses with ip G Ren it is necessary to rename constraints with tp. Thus 
define tp(h :- c, g) = <p(h) :- (c), tp{g). To rename apart from a syntactic object o, 
let w <C P indicate that there exists w' G P and ip G Ren such that var(cod(</?)) n 
FV(iu') = 0, pfy) = u; and FV{o) n FV(u;) = 0. 



Definition 3.2 
Given a constraint logic program P over a semi-cylindric constraint system $\mathcal{C}$, 
$\rightarrow_P \subseteq State^2$ is the least relation such that: 

  $s = \langle p(\vec{x}), g; c\rangle \rightarrow_P \langle g', g; c \otimes d_{\vec{x},\vec{x}'} \otimes c'\rangle$ 

where $p(\vec{x}') \text{ :- } c', g' \ll_s P$. 

The operational semantics is specified by the transitive closure of the transition 
relation on (atomic) goals, that is, $\mathcal{O}^{\mathcal{C}}(P) = [\{[p(\vec{x}) \text{ :- } c]_\approx \mid \langle p(\vec{x}); 1\rangle \rightarrow_P^* \langle \epsilon; c\rangle\}]_\equiv$. 
The relationship between the operational and fixpoint semantics is stated below. 

Theorem 3.2 
$\mathcal{O}^{\mathcal{C}}(P) = \mathcal{F}^{\mathcal{C}}(P)$. 



3.5 Abstract semantics for constraint logic programs 

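To apply abstraction techniques and finitely characterise $\mathcal{F}^{\mathcal{C}}(P)$, and thereby $\mathcal{O}^{\mathcal{C}}(P)$, 
the semi-cylindric domain $\mathcal{C}$ is replaced by the cHa $\wp^{\downarrow}(\mathcal{C})$ which is particularly 
amenable to approximation and backward reasoning. 

If P is a constraint logic program over $\mathcal{C}$, then ${\downarrow}(P) = \{h \text{ :- } {\downarrow}(c), g \mid h \text{ :- } c, g \in P\}$. 
Furthermore, if $I \in Int^{\mathcal{C}}$, then let ${\downarrow}([I]_\equiv) = [\{[p(\vec{x}) \text{ :- } {\downarrow}(c)]_\approx \mid [p(\vec{x}) \text{ :- } c]_\approx \in I\}]_\equiv$. 
Note the overloading on $\approx$ and hence $\equiv$. The $\approx$ of $[p(\vec{x}) \text{ :- } c]_\approx$ is induced by $\langle \mathcal{C}, \leq\rangle$ 
whereas the $\approx$ of $[p(\vec{x}) \text{ :- } {\downarrow}(c)]_\approx$ is induced by $\langle \wp^{\downarrow}(\mathcal{C}), \subseteq\rangle$. The following proposition 
details the relationship between $\mathcal{F}^{\mathcal{C}}$ and $\mathcal{F}^{\wp^{\downarrow}(\mathcal{C})}$. 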

Proposition 3.1 

1(t c (p))^t^U(p)). 

Let $\langle \mathcal{C}, \leq, \oplus, \otimes, 1, 0, \exists, d\rangle$ denote a cylindric constraint system. If $\rho \in uco(\mathcal{C})$ then 
$\langle \rho(\mathcal{C}), \leq, \otimes\rangle$ is a complete lattice. If $\rho$ is additive, then $\langle \rho(\mathcal{C}), \leq, \oplus, \otimes\rangle$ is a sub-lattice 
of $\langle \mathcal{C}, \leq, \oplus, \otimes\rangle$. More generally, the join is denoted $\oplus'$. Observe that $\rho(\mathcal{C})$ has 1 and 
$\rho(0)$ for top and bottom and $c_1 \oplus c_2 \leq \rho(c_1 \oplus c_2) = c_1 \oplus' c_2$ for all $c_1, c_2 \in \rho(\mathcal{C})$. 
A cylindric constraint system is obtained by augmenting $\rho(\mathcal{C})$ with cylindrification 
$\exists'_x$ and diagonalisation $d'_{x,y}$ operators. To abstract $\langle \mathcal{C}, \leq, \oplus, \otimes, 1, 0, \exists, d\rangle$ safely 
with $\langle \rho(\mathcal{C}), \leq, \oplus', \otimes, 1, \rho(0), \exists', d'\rangle$, $\rho$ is required to be a semi-morphism (Giacobazzi 
et al., 1995) which additionally requires that $\rho(\exists_x(c)) \leq \exists'_x(\rho(c))$ for all $c \in \mathcal{C}$ and 
$\rho(d_{x,y}) \leq d'_{x,y}$ for all $x, y \in V$. In fact, these requirements turn out to be relatively 
weak conditions: most abstract domains come equipped with (abstract) operators 
to model projection and parameter passing. 

Example 3.6 

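Consider the cylindric system $\langle \wp^{\downarrow}(Eqn), \subseteq, \cup, \cap, Eqn, \emptyset, \exists, d\rangle$ derived from the 
semi-cylindric system introduced in example 3.1. Let $Bool = Bool_V$ and $Pos = Pos_V$. 
Define $\alpha_{Pos} : \wp^{\downarrow}(Eqn) \to Pos$ by $\alpha_{Pos}(C) = \vee\{\alpha(\theta) \mid \theta \in mgu(E) \wedge E \in C\}$ 
and $\alpha(\theta) = \wedge\{x \Leftrightarrow \wedge var(t) \mid x \mapsto t \in \theta\}$. Also define $\gamma_{Pos} : Pos \to \wp^{\downarrow}(Eqn)$ by 
$\gamma_{Pos}(f) = \cup\{C \in \wp^{\downarrow}(Eqn) \mid \alpha_{Pos}(C) \models f\}$ and observe $\rho_{Pos} \in uco(\wp^{\downarrow}(Eqn))$ where 
$\rho_{Pos} = \gamma_{Pos} \circ \alpha_{Pos}$. To construct a semi-morphism, put $d'_{x,y} = \gamma_{Pos}(x \Leftrightarrow y)$ and 
$\exists'_x(C) = \gamma_{Pos}(f[x \mapsto 1] \vee f[x \mapsto 0])$ where $f = \alpha_{Pos}(C)$. Then $\rho_{Pos}(d_{x,y}) \subseteq d'_{x,y}$ and 
$\rho_{Pos}(\exists_x(C)) \subseteq \exists'_x(\rho_{Pos}(C))$ for all $C \in \wp^{\downarrow}(Eqn)$. Note that $C_1 \cap C_2 = \gamma_{Pos}(f_1 \wedge f_2)$ 
and $C_1 \oplus' C_2 = \gamma_{Pos}(f_1 \vee f_2)$ where $C_i = \gamma_{Pos}(f_i)$. Surprisingly $C_1 \oplus' C_2 \neq C_1 \cup C_2$ 
(Filé & Ranzato, 1994), as witnessed by $C_1 = \gamma_{Pos}(x)$ and $C_2 = \gamma_{Pos}(x \Leftrightarrow y)$ since 
$\{y = f(x, z)\} \notin C_1 \cup C_2$ whereas $\alpha_{Pos}(\{y = f(x, z)\}) = y \Leftrightarrow (x \wedge z) \models x \vee (x \Leftrightarrow y)$ 
so that $\{y = f(x, z)\} \in C_1 \oplus' C_2 = \gamma_{Pos}(x \vee (x \Leftrightarrow y))$. Nevertheless, $\rho_{Pos}$ is 
a semi-morphism between $\langle \wp^{\downarrow}(Eqn), \subseteq, \cup, \cap, Eqn, \emptyset, \exists, d\rangle$ and 
$\langle \rho_{Pos}(\wp^{\downarrow}(Eqn)), \subseteq, \oplus', \cap, Eqn, \rho_{Pos}(\emptyset), \exists', d'\rangle$. 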

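The operator $\rho$ lifts to the complete lattice $Int^{\mathcal{C}}/{\equiv}$ by $\rho([I]_\equiv) = [\rho(I)]_\equiv$ where 
$\rho(I) = \{[p(\vec{x}) \text{ :- } \rho(c)]_\approx \mid [p(\vec{x}) \text{ :- } c]_\approx \in I\}$. Thus $\rho \in uco(Int^{\mathcal{C}}/{\equiv})$. It is also useful to 
lift $\rho$ to programs by $\rho(P) = \{h \text{ :- } \rho(c), g \mid h \text{ :- } c, g \in P\}$. The following result relates 
the fixpoint semantics of P to that of its abstraction $\rho(P)$. 

Theorem 3.3 

Let $\mathcal{C}$ be a cylindric constraint system. If $\rho \in uco(\mathcal{C})$ is a semi-morphism, then 
$\rho(\mathcal{F}^{\mathcal{C}}(P)) \sqsubseteq \mathcal{F}^{cod(\rho)}(\rho(P))$. 

Corollary 3.1 

Let $\mathcal{C}$ be a semi-cylindric constraint system. If $\rho \in uco(\wp^{\downarrow}(\mathcal{C}))$ is a semi-morphism, 
then $\rho({\downarrow}(\mathcal{F}^{\mathcal{C}}(P))) \sqsubseteq \mathcal{F}^{cod(\rho)}(\rho({\downarrow}(P)))$. 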



4 Constraint logic programs with assertions 



We consider programs annotated with assertions (Drabent & Małuszyński, 1988). 
When considering the operational semantics of a constraint logic program, it is 
natural to associate assertions with syntactic elements of the program such as 
predicates or the program points between body atoms. Without loss of generality, we 
decorate the neck of each clause with a set of constraints C that is interpreted as 
an assertion. When C is encountered, the store c is examined to determine whether 
$c \in C$ (modulo renaming). If $c \in C$ execution proceeds normally, otherwise an error 
state, denoted $\Diamond$, is entered and execution halts. 

To formalise this idea, let $\mathcal{C}$ be a semi-cylindric constraint system and 
$\rho \in uco(\wp^{\downarrow}(\mathcal{C}))$. The assertion language (in whatever syntactic form it takes) is 
described by $\rho$. A clause of a constraint logic program over $\mathcal{C}$ with assertions over 
$cod(\rho)$ then takes the form $h \text{ :- } C \diamond c, g$ where $h \in Atom$, $C \in cod(\rho)$, $c \in \mathcal{C}$, $g \in Goal$ 
and $\diamond$ separates the assertion from the body of the clause. Notice that C is an 
order-ideal and thus downward closed. (C can thus represent disjunctions of constraints, 
but the semantics presented in this section should not be confused with a collecting 
semantics.) Note also that program transformation (Puebla et al., 2000a) can be 
used to express program point assertions in terms of our assertion language. To 
specify the behaviour of programs with assertions, let $State_\Diamond = State \cup \{\Diamond\}$, and 
let $CLP(P) = \{h \text{ :- } c, g \mid h \text{ :- } C \diamond c, g \in P\}$. The following definition details how the 
operational semantics for the assertion language is realised in terms of projection, 
renaming and a test for inclusion. 



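Definition 4.1 
Given a constraint logic program P over a semi-cylindric constraint system $\mathcal{C}$ with 
assertions over $\rho(\wp^{\downarrow}(\mathcal{C}))$, $\Rightarrow_P \subseteq State \times State_\Diamond$ is the least relation such that: 

  $s = \langle p(\vec{x}), g; c\rangle \Rightarrow_P$ 
    $\Diamond$  if $p(\vec{x}') \text{ :- } C' \diamond c', g' \in P \wedge \partial_{\vec{x}}^{\vec{x}'}(\overline{\exists}_{\vec{x}}(c)) \notin C'$ 
    $\langle g', g; c \otimes d_{\vec{x},\vec{x}'} \otimes c'\rangle$  else if $p(\vec{x}') \text{ :- } c', g' \ll_s CLP(P)$ 

Recall that $p(\vec{x}') \text{ :- } c', g' \ll_s CLP(P)$ ensures that the clause $p(\vec{x}') \text{ :- } c', g'$ does not 
share any variables with s. The operational semantics of P is then defined in terms 
of $\Rightarrow_P$ as $\mathcal{A}^{\rho,\mathcal{C}}(P) = [\{[p(\vec{x}) \text{ :- } c]_\approx \mid \langle p(\vec{x}); 1\rangle \Rightarrow_P^* \langle \epsilon; c\rangle\}]_\equiv$. The relationship between 
the two operational semantics is stated in the following (trivial) result. 

Proposition 4.1 
$\mathcal{A}^{\rho,\mathcal{C}}(P) \sqsubseteq \mathcal{O}^{\mathcal{C}}(CLP(P))$ 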

Assertions are often used as an interface between behaviour that is amenable to 
formalisation, for example as an operational semantics, and behaviour that is less 
tractable, for example, the semantics of a builtin (Puebla et al., 2000b). More to 
the point, it is not always possible to infer the behaviour of a builtin from its 
definition, partly because builtins are often complicated and partly because builtins 
are often expressed in a language such as C. Our work requires assertions for each 
builtin in order to specify its calling convention (for example, which arguments are 
required to be ground) and its success behaviour (for example, which arguments 
are grounded). 



5 Backward fixpoint semantics for constraint logic programs with assertions 

Let P be a constraint logic program over the semi-cylindric constraint system $\mathcal{C}$ 
with assertions over $\rho(\wp^{\downarrow}(\mathcal{C}))$. One natural and interesting question is whether the 
error state is reachable (or conversely not reachable) in P from an initial state 
$\langle p(\vec{x}); c\rangle$. For a given constraint logic program P with assertions, the backward 
fixpoint semantics presented in this section infers a (possibly empty) set of $c \in \mathcal{C}$ 
for which $\langle p(\vec{x}); c\rangle \not\Rightarrow_P^* \Diamond$. The semantics formalises the informal backward analysis 
sketched in section 2. 

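For generality, the semantics is parameterised by $\mathcal{C}$ and $\rho$. The correctness 
argument requires $\rho$ to be a semi-morphism between $\langle \wp^{\downarrow}(\mathcal{C}), \subseteq, \cup, \cap, \mathcal{C}, \emptyset, \exists, d\rangle$ and 
$\langle \rho(\wp^{\downarrow}(\mathcal{C})), \subseteq, \oplus', \cap, \mathcal{C}, \rho(\emptyset), \exists', d'\rangle$. Additionally, $\rho(\wp^{\downarrow}(\mathcal{C}))$ must be a cHa, that is, it 
must possess a pseudo-complement $\rightarrow'$. To explain how pseudo-complement aids 
backward analysis, consider the problem of inferring $c \in \mathcal{C}$ for which $\langle g; c\rangle \not\Rightarrow_P^* \Diamond$ 
where $g = p_1(\vec{x}_1), \ldots, p_n(\vec{x}_n)$. Suppose $f_i \in \rho(\wp^{\downarrow}(\mathcal{C}))$ describes the success pattern 
for $p_i(\vec{x}_i)$, that is, if $\langle p_i(\vec{x}_i); 1\rangle \Rightarrow_P^* \langle \epsilon; c\rangle$ then $c \in f_i$. Moreover, suppose 
$d_i \in \rho(\wp^{\downarrow}(\mathcal{C}))$ approximates the initial call pattern for $p_i(\vec{x}_i)$, that is, if $c \in d_i$ 
then $\langle p_i(\vec{x}_i); c\rangle \not\Rightarrow_P^* \Diamond$. Observe that $\langle p_{n-1}(\vec{x}_{n-1}), p_n(\vec{x}_n); c\rangle \not\Rightarrow_P^* \Diamond$ if $c \in d_{n-1} \cap e$ 
and $e \cap (d_{n-1} \cap f_{n-1}) \subseteq d_n$. This follows since $\langle p_{n-1}(\vec{x}_{n-1}); c\rangle \not\Rightarrow_P^* \Diamond$ because 
$c \in d_{n-1} \cap e \subseteq d_{n-1}$. Moreover, if $\langle p_{n-1}(\vec{x}_{n-1}), p_n(\vec{x}_n); c\rangle \Rightarrow_P^* \langle p_n(\vec{x}_n); c'\rangle$ then 
$c' \in (d_{n-1} \cap e) \cap f_{n-1} \subseteq d_n$ and thus $\langle p_n(\vec{x}_n); c'\rangle \not\Rightarrow_P^* \Diamond$. Putting $e = \rho(\emptyset)$ ensures 
$e \cap (d_{n-1} \cap f_{n-1}) \subseteq d_n$ and thereby achieves correctness. However, for precision, 
$d_{n-1} \cap e$ should be maximised. Since $\rho(\wp^{\downarrow}(\mathcal{C}))$ is a cHa, this reduces to assigning 
$e = \oplus'\{e' \in \rho(\wp^{\downarrow}(\mathcal{C})) \mid e' \cap (d_{n-1} \cap f_{n-1}) \subseteq d_n\} = (d_{n-1} \cap f_{n-1}) \rightarrow' d_n$. In general, 
without pseudo-complement, there is no unique best e that maximises precision (see 
example 5.1). The construction is generalised for $g = p_1(\vec{x}_1), \ldots, p_n(\vec{x}_n)$, by putting 
$e_{n+1} = \mathcal{C}$ and $e_i = d_i \cap ((d_i \cap f_i) \rightarrow' e_{i+1}) = d_i \cap (f_i \rightarrow' e_{i+1})$ for $1 \leq i \leq n$. Then 
$\langle g; c\rangle \not\Rightarrow_P^* \Diamond$ if $c \in e_1$ as required. This iterated application of $\rightarrow'$ to propagate 
requirements right-to-left is the very essence of the backward analysis. 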
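
The right-to-left propagation can be illustrated over Boolean functions. This is only a sketch under stated assumptions: functions are represented by their sets of satisfying assignments over two variables, so that meet is set intersection, and the pseudo-complement is taken to be classical implication, which is what it coincides with for Pos. The names fn, implies and propagate are illustrative, and the success pattern chosen for the last atom is an assumption made to complete the example.

```python
from itertools import product

VARS = ("x", "y")
ALL = frozenset(product((0, 1), repeat=len(VARS)))  # the function 1

def fn(pred):
    """Model set of a Boolean function given as a predicate on an environment."""
    return frozenset(b for b in ALL if pred(dict(zip(VARS, b))))

def implies(f, g):
    """Pseudo-complement on model sets: the weakest e with e /\\ f |= g."""
    return (ALL - f) | g

def propagate(ds, fs, top=ALL):
    """e_{n+1} = top and e_i = d_i /\\ (f_i => e_{i+1}); returns e_1."""
    e = top
    for d, f in zip(reversed(ds), reversed(fs)):
        e = d & implies(f, e)
    return e

one = ALL
iff = fn(lambda v: v["x"] == v["y"])       # f_{n-1} = x <-> y
both = fn(lambda v: v["x"] and v["y"])     # d_n = x /\ y

# Two-atom goal with d_{n-1} = 1, f_{n-1} = x <-> y, d_n = x /\ y.
# f_n = 1 is an assumption; it is irrelevant here because e_{n+1} is top.
e1 = propagate([one, both], [iff, one])
```

The result e1 is x \/ y: provided x or y is ground on entry, the equivalence established by the first atom makes both ground before the second atom is called.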

Example 5.1 

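Returning to examples 3.2-3.5, let $\alpha_{EPos}(C) = \curlyvee\{\alpha(\theta) \mid \theta \in mgu(E) \wedge E \in C\}$, 
$\gamma_{EPos}(f) = \cup\{C \in \wp^{\downarrow}(Eqn) \mid \alpha_{EPos}(C) \models f\}$ and $\rho_{EPos} = \gamma_{EPos} \circ \alpha_{EPos}$. 
Note that $C_1 \cap C_2 = \gamma_{EPos}(f_1 \wedge f_2)$ and $C_1 \oplus' C_2 = \gamma_{EPos}(f_1 \curlyvee f_2)$ where 
$C_i = \gamma_{EPos}(f_i)$. By defining $\exists'$ and $d'$ in an analogous way to example 3.5, a 
semi-morphism $\rho_{EPos}$ is constructed between $\langle \wp^{\downarrow}(Eqn), \subseteq, \cup, \cap, Eqn, \emptyset, \exists, d\rangle$ and 
$\langle \rho_{EPos}(\wp^{\downarrow}(Eqn)), \subseteq, \oplus', \cap, Eqn, \rho_{EPos}(\emptyset), \exists', d'\rangle$. Recall that $\rho_{EPos}(\wp^{\downarrow}(Eqn))$ is not 
a cHa. Now consider the problem of inferring an initial c for $\langle p_{n-1}(\vec{x}_{n-1}), p_n(\vec{x}_n); c\rangle$ 
within $\rho_{EPos}(\wp^{\downarrow}(Eqn))$. In particular let $d_{n-1} = \gamma_{EPos}(1)$, $f_{n-1} = \gamma_{EPos}(x \Leftrightarrow y)$ 
and $d_n = \gamma_{EPos}(x \wedge y)$. Then $e_i \cap (d_{n-1} \cap f_{n-1}) \subseteq d_n$ for $e_1 = \gamma_{EPos}(x)$ and 
$e_2 = \gamma_{EPos}(y)$ but $(e_1 \oplus' e_2) \cap (d_{n-1} \cap f_{n-1}) = \gamma_{EPos}((x \curlyvee y) \wedge 1 \wedge (x \Leftrightarrow y)) = 
\gamma_{EPos}(x \Leftrightarrow y) \not\subseteq \gamma_{EPos}(x \wedge y) = d_n$. Thus there is no unique e maximising precision. 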



Example 5.2 

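Identity $\rho_{id} = \lambda x.x$ is the trivial semi-morphism between $\langle \wp^{\downarrow}(\mathcal{C}), \subseteq, \cup, \cap, \mathcal{C}, \emptyset, \exists, d\rangle$ 
and $\langle \wp^{\downarrow}(\mathcal{C}), \subseteq, \cup, \cap, \mathcal{C}, \emptyset, \exists, d\rangle$ where the pseudo-complement is given by $C_1 \rightarrow' C_2 = 
\{c \in \mathcal{C} \mid \forall c' \leq c . c' \in C_1 \Rightarrow c' \in C_2\}$ (Birkhoff, 1967). 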



Example 5.3 

Recall that pp os is a semi-morphism between (p^(Eqn), C, U, fl, -Egn, 0, 3, d) and 

PPo«(0); 3', d'). Alth ough ®' ^ U, pp os { p^{Eqn)) is 
a sub-cHa of p^(Eqn) with respect to n and — »•' ( Bcozzari, to appear ). Moreover, 
pseudo-complement (intuitionistic implication) — »•' coincides with classic implica- 
tion =>■ in the sense that G\ — C 2 = 7p os (/i =>■ /a) where C 2 : = 7p os (/i)- 
This follows since V |= / 2 |= V / 2 and thus /1 => / 2 6 Pos. Moreover, 

/iA/h /a iff H (A A /) / 2 iff h / => Hi) V /a iff / h (-/l) V h- Hence 
d C 2 = ®'{C G ^(pH^n)) |dnCCC 2 } = 7p os (/! / 2 ). Thus is 
finitely computable for pp os . Finally note that -1 and V are defined on Bool rather 
than Pos since -1/ ^ Pos iff / G Pos. 



Example 5.4 

Now consider the problem of inferring an initial c for $\langle p_{n-1}(\vec{x}_{n-1}), p_n(\vec{x}_n); c\rangle$ within 
$\rho_{Pos}(\wp^{\downarrow}(Eqn))$. Analogous to example 5.1 let $d_{n-1} = \gamma_{Pos}(1)$, $f_{n-1} = \gamma_{Pos}(x \Leftrightarrow y)$ 
and $d_n = \gamma_{Pos}(x \wedge y)$. Then $e_i \cap (d_{n-1} \cap f_{n-1}) \subseteq d_n$ for $e_1 = \gamma_{Pos}(x)$ and $e_2 = 
\gamma_{Pos}(y)$ and $(e_1 \oplus' e_2) \cap (d_{n-1} \cap f_{n-1}) = \gamma_{Pos}((x \vee y) \wedge 1 \wedge (x \Leftrightarrow y)) = \gamma_{Pos}(x \wedge y) = d_n$. 
Thus there is a unique e maximising precision. 

Since $\langle \rho(\wp^{\downarrow}(\mathcal{C})), \subseteq, \oplus', \cap, \mathcal{C}, \rho(\emptyset), \exists', d'\rangle$ is a cylindric constraint system, it follows 
that $e \subseteq \exists'_x(e)$ for all $e \in \rho(\wp^{\downarrow}(\mathcal{C}))$. A consequence of $e \subseteq \exists'_x(e)$ is that projection 
approximates from above. Approximation from above, however, is not entirely 
appropriate for backward analysis. In particular, observe that if $\langle g; c\rangle \not\Rightarrow_P^* \Diamond$ for all 
$c \in e$, then it does not necessarily follow that $\langle g; c\rangle \not\Rightarrow_P^* \Diamond$ for all $c \in \exists'_x(e)$. What 
is required is a dual notion of projection, say denoted $\forall'$, that approximates from 
below. Then $\langle g; c\rangle \not\Rightarrow_P^* \Diamond$ for all $c \in \forall'_x(e)$. Although $\forall'$ is an abstract operator, the 
concept is defined for an arbitrary cylindric constraint system for generality. 

Definition 5.1 

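If $\langle \mathcal{C}, \leq, \oplus, \otimes, 1, 0, \exists, d\rangle$ is a cylindric constraint system and $x \in V$ then $\forall_x : \mathcal{C} \to \mathcal{C}$ 
is a monotonic operator such that: $\exists_x(\forall_x(c)) \leq c$ and $c \leq \forall_x(\exists_x(c))$ for all $c \in \mathcal{C}$. 

Recall that $\exists_x$ is monotonic and thus $\exists_x$ plays the role of the lower adjoint $\alpha$ and $\forall_x$ that of the upper 
adjoint $\gamma$. More exactly, it follows that $\forall_x$ can be automatically constructed from 
$\exists_x$ by $\forall_x(c) = \oplus\{c' \in \mathcal{C} \mid \exists_x(c') \leq c\}$. Observe that this ensures that $\forall_x$ is the most 
precise projection operator from below. For succinctness, define $\forall_{\{x_1, \ldots, x_n\}}(c) = 
\forall_{x_1}(\ldots(\forall_{x_n}(c)))$ and $\overline{\forall}_X(c) = \forall_{FV(c) \setminus X}(c)$. 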

Example 5.5 

For $\rho_{id}$, let $\forall_x(C) = {\downarrow}(\{c \in C \mid \exists_x(c) = c\})$. 



Example 5.6 

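For $\rho_{Pos}$, let $\forall_x(C) = \gamma_{Pos}(f')$ if $f' \in Pos$, otherwise $\forall_x(C) = \gamma_{Pos}(0)$, where $C = 
\gamma_{Pos}(f)$ and $f' = f[x \mapsto 0] \wedge f[x \mapsto 1]$. Observe that $\exists_x(f)[x \mapsto 0] \wedge \exists_x(f)[x \mapsto 1] = 
\exists_x(f)$ and hence $C \subseteq \exists_x(C) = \forall_x(\exists_x(C))$ as required. Moreover, if $\forall_x(C) = \gamma_{Pos}(0)$ 
then $\exists_x(\forall_x(C)) = \gamma_{Pos}(0) \subseteq C$. Otherwise $\exists_x(f[x \mapsto 0] \wedge f[x \mapsto 1]) = f[x \mapsto 0] \wedge f[x \mapsto 1] \models f$. 
Thus $\exists_x(\forall_x(C)) \subseteq C$ as required. Finally, note that $\forall_x$ is finitely 
computable for $\rho_{Pos}$. For example if $C_i = \gamma_{Pos}(f_i)$, $f_1 = (x \Leftarrow y)$, $f_2 = (x \wedge y)$ and 
$f_3 = (x \vee y)$, then $\forall_x(C_i) = \gamma_{Pos}(f'_i)$ where $f'_1 = 0$, $f'_2 = 0$ and $f'_3 = y$. 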

Backward analysis can now be formalised as follows. 

Definition 5.2 

Given a constraint logic program P over a semi-cylindric constraint system C with 
assertions over p(p l (C)), the operator V p p fi : Int cod ^ -> Int cod ^ is defined by: 



c>) = U 



V 


\p(x) 


- e] rj £ E 


V 


p(x) :-Coc,pi( 


fl), . . . ,p n (x n ) G P 


3 


{\pi(xi) :- 


/i]«}?=l c P 


3 


{[Pi(^i) > 


di]«}2=i c P 




e n+ i = C A e. 


= in (/i -►' e i+ i) 




e C Vj(eo) A e 


= Cn(p(j(c))-' ex) 



where [P]^ = J" cod ^ ( i o(|(CLP(P)))). 

Since $\mathcal{D}$ is parameterised by $\rho$ and $\mathcal{C}$ it can be interpreted as a backward analysis 
framework. $\mathcal{D}$ requires F, the success patterns of the program obtained by discarding the 
assertions, to be pre-computed. $\mathcal{D}$ considers each clause in the program in turn and 
calculates those states which ensure that the clause (and those it calls) will not 
violate an assertion. An abstraction which characterises these states is calculated by 
propagating requirements, represented as abstractions, right-to-left by repeated 
application of pseudo-complement. Projection from below then computes those states 
which, when restricted to the head variables, still ensure that no error arises in the 
clause (and those it calls). Repeated application of $\mathcal{D}$ yields a decreasing sequence 
of interpretations. 

The operator $\mathcal{D}_P^{\rho,\mathcal{C}}$ lifts to $Int^{cod(\rho)}/{\equiv}$ by $\mathcal{D}_P^{\rho,\mathcal{C}}([D]_\equiv) = [\mathcal{D}_P^{\rho,\mathcal{C}}(D)]_\equiv$. Since 
$\langle cod(\rho), \subseteq, \cup, \cap\rangle$ is a complete lattice, $\mathcal{D}_P^{\rho,\mathcal{C}}$ will possess a gfp if $\mathcal{D}_P^{\rho,\mathcal{C}}$ is monotonic. 
The existence of $gfp(\mathcal{D}_P^{\rho,\mathcal{C}})$ is guaranteed by the following result since co-continuity 
implies monotonicity. 



Proposition 5.1

$\mathcal{D}_P^{\rho,C} : Int_{cod(\rho)}/{\equiv} \to Int_{cod(\rho)}/{\equiv}$ is co-continuous.

Since $gfp(\mathcal{D}_P^{\rho,C})$ exists, a backward fixpoint semantics can be defined
$\mathcal{D}^{\rho,C}(P) = gfp(\mathcal{D}_P^{\rho,C})$ and computed by lower Kleene iteration.
To establish a connection between $\mathcal{D}^{\rho,C}(P)$ and the operational semantics of
$P$, it is useful to annotate the goals of a state with their depth in the computation tree.
To formalise this idea $\Rightarrow_P$ is lifted to the annotated states
$Conf_{\Diamond} = Conf \cup \{\Diamond\}$ where $Conf = Goal \times C \times \mathbb{N}^*$ to
obtain the transition system $\Rrightarrow_P$.
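Lower Kleene iteration starts at the top element and applies the operator until the sequence stabilises. The sketch below shows the idea on a finite powerset lattice; the transition graph, the state names and the `safe_step` operator are hypothetical and merely stand in for the backward operator. Termination here relies on the lattice being finite and the operator being monotonic.

```python
def gfp(f, top):
    """Iterate top, f(top), f(f(top)), ... until a fixpoint is reached."""
    current = top
    while True:
        nxt = f(current)
        if nxt == current:
            return current
        current = nxt

# Hypothetical transition graph; "err" plays the role of an error state.
EDGES = {"a": {"b"}, "b": {"a"}, "c": {"err"}, "d": {"c", "a"}, "err": set()}
BAD = {"err"}

def safe_step(states):
    """Keep the states that are not bad and whose successors are all kept."""
    return frozenset(s for s in states if s not in BAD and EDGES[s] <= states)

SAFE = gfp(safe_step, frozenset(EDGES))
print(sorted(SAFE))
```

The iterates shrink from all five states to `{a, b, c, d}`, then `{a, b, d}`, then `{a, b}`, which is stable: these are the states from which the error state cannot be reached.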

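Definition 5.3

Given a constraint logic program with assertions $P$ over a semi-cylindric constraint
system $C$, $\Rrightarrow_P \subseteq Conf \times Conf_{\Diamond}$ is the least relation such that:

$$
\langle p(\vec{x}), g; c; n \cdot h \rangle \Rrightarrow_P
\begin{cases}
\Diamond & \text{if } \langle p(\vec{x}), g; c \rangle \Rightarrow_P \Diamond \\
\langle g', g; c'; (n+1)^{|g'|} \cdot h \rangle & \text{if } \langle p(\vec{x}), g; c \rangle \Rightarrow_P \langle g', g; c' \rangle
\end{cases}
$$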



The sequence $(n+1)^{|g'|}$ denotes $|g'|$ concatenations of $n+1$. The following result
relates the depth of the goals of the annotated states to the iterates obtained by
lower Kleene iteration. Informally, it says that if a constrained atom $p(\vec{x}) \mathrel{:\!-} e$
occurs in the interpretation obtained by applying $\mathcal{D}$ $k$ times, and $e$ characterises
an initial state (in a certain sense), and the depth of the goals in a derivation starting at the
initial state does not exceed $k$, then the derivation will not violate an assertion.
The main safety theorem flows out of this result.
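The depth bookkeeping is small enough to sketch directly. In the illustration below a goal's annotation is a list of integers, one per atom, and reducing the leftmost atom at depth $n$ replaces its entry by one copy of $n+1$ for each atom of the clause body. The function name and the list representation are assumptions made for this example.

```python
def annotate_step(depths, body_length):
    """Replace the depth n of the selected (leftmost) atom by
    body_length copies of n + 1, keeping the remaining depths h."""
    n, rest = depths[0], depths[1:]
    return [n + 1] * body_length + rest

# A query at depth 1 is reduced by a clause with two body atoms,
# then the first of those atoms is resolved against a fact.
state = [1]
state = annotate_step(state, 2)
print(state)
state = annotate_step(state, 0)
print(state)
```

The maximum entry seen along a derivation is the quantity that the following lemma compares against the number of iterations $k$.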

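Lemma 5.1

Let $\langle p(\vec{y}); c''; 1 \rangle = s_1 \Rrightarrow_P \cdots \Rrightarrow_P s_n \Rrightarrow_P \Diamond$,
$s_i = \langle g_i; c_i; h_i \rangle$ and $(\mathcal{D}_P^{\rho,C})^k(\top) = [D_k]_{\equiv}$.

If $\max(\{\max(h_i) \mid 1 \leq i \leq n\}) \leq k$ and $[p(\vec{y}) \mathrel{:\!-} e]_{\approx} \in D_k$
then $\overline{\exists}_{\vec{y}}(c'') \notin \overline{\exists}_{\vec{y}}(e)$.

Theorem 5.1

If $\mathcal{D}^{\rho,C}(P) = [D]_{\equiv}$, $[p(\vec{y}) \mathrel{:\!-} e]_{\approx} \in D$ and
$c \in \overline{\exists}_{\vec{y}}(e)$ then $\langle p(\vec{y}); c \rangle \not\Rightarrow_P^{*} \Diamond$.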

6 Experimental evaluation 

In order to evaluate the usefulness of the analysis framework presented in section 5,
a backward Pos analyser has been constructed for inferring calling modes. The
fixpoint component of the analyser is coded in SICStus Prolog 3.8.3. The domain
operations are coded in C and are essentially the binary decision diagram (BDD)
routines written by Armstrong and Schachte (Armstrong et al., 1998). The analyser
takes, as input, a program written in a declarative subset of ISO Prolog. It outputs
a mode for each program predicate. The safety result of theorem 5.1 ensures that
if a call to a predicate is at least as instantiated as the inferred mode, then the call
will not violate an instantiation requirement. Modes are expressed as grounding
dependencies (Armstrong et al., 1998).

The implementation follows the framework defined in section 5 very closely. The
analyser was straightforward to implement as it is essentially two bottom-up fixpoint
computations: one for $\mathcal{F}$ and the other for $\mathcal{D}$. The only subtlety is in
handling the builtins. For each builtin, it is necessary to select a grounding dependency that
is sufficient for avoiding an instantiation error. This is a lower approximation (the
required mode of table 1). It is also necessary to specify behaviour on success. This
is an upper approximation (the success mode of table 1). The lower approximations
are the assertions that are added to the Prolog program to obtain a constraint logic
program with assertions.

Interestingly, the success mode does not always entail the required mode. Univ 
(=..) illustrates this. A sufficient but not necessary condition for univ not to error is 
that either the first or second argument is ground. This cannot be weakened in Pos 
(but could be weakened in a type dependency domain (Codish & Lagoon, 2000) 
that expressed rigid lists). The success mode is that the first argument is ground 
iff the second argument is ground (which does not entail the required mode). Note 
too that keysort and sort error if their first argument is free. A sufficient mode 
for expressing this requirement is that the first argument is ground. Again, this 
requirement cannot be weakened in Pos. 
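The observation that a success mode need not entail the required mode can be checked by enumerating truth assignments. The sketch below encodes three builtins with the required and success modes described in the text and in table 1, where each argument is `True` when the corresponding term is ground. The dictionary layout and the `entails` helper are assumptions made for illustration, not the analyser's data structures.

```python
from itertools import product

def entails(f, g, arity):
    """f entails g if every assignment satisfying f also satisfies g."""
    return all(g(*bits)
               for bits in product([False, True], repeat=arity)
               if f(*bits))

# (required mode, success mode) per builtin, over groundness of t1 and t2.
BUILTIN_MODES = {
    "is/2":   (lambda t1, t2: t2,       lambda t1, t2: t1 and t2),
    "=../2":  (lambda t1, t2: t1 or t2, lambda t1, t2: t1 == t2),
    "sort/2": (lambda t1, t2: t1,       lambda t1, t2: t1 == t2),
}

for name, (required, success) in BUILTIN_MODES.items():
    print(name, entails(success, required, 2))
```

For `=..` the assignment in which neither argument is ground satisfies the success mode (first ground iff second ground) but not the required mode, so the entailment fails; for `is` the success mode does entail the required mode.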

The analyser has been applied to some standard Prolog benchmarks which can
be found at http://www.oakland.edu/~l2lu/benchmarks-BG.zip. The results of the
analysis, that is, the calling modes for the predicates in the smaller benchmarks, are
given in table 2. The results, though surprising in some cases (see sort of permSort
and insert of treesort for example), have been verified by hand and appear to be
optimal for Pos. The analysis, of course, can be applied to larger programs (though
it becomes very difficult to verify the results by hand) and table 3 demonstrates
that the analysis scales smoothly to medium-scale programs at least. The table lists
the larger benchmarks (which possibly include some unreachable code) in order of
increasing size, measured by the total number of atoms in the source. The abs
column records the time in milliseconds required to read, parse and normalise the
source into the ground program representation used by the analyser; lfp is the time
needed to compute the fixpoint characterising the success modes; gfp is the time
needed to compute the calling modes; and finally sum is the total analysis time.
This includes the (usually negligible) overhead of annotating the source with the
modes required by builtins. Timings were performed on a Dell GX200 1GHz PC
with 128 MB memory running Windows 2000. The timings suggest that the analysis
is practical at least for medium-scale programs (though the running time for BDDs
can be sensitive to the particular dependencies that arise). Moreover, with a
state-of-the-art GER factorised BDD package (Bagnara & Schachte, 1999) the analysis
would be faster. Interestingly, the time to compute the lfp often dominates the
whole analysis. BDD widening will be required to analyse very large applications
but this is a study in itself (Heaton et al., 2000).

builtin                                             required mode   success mode
--------------------------------------------------  --------------  ------------
t1 == t2, t1 \== t2, t1 @< t2, t1 @> t2,            true            true
t1 @=< t2, t1 @>= t2, t1 \= t2, !,
compound(t1), display(t1), listing, listing(t1),
nl, nonvar(t1), print(t1), portray_clause(t1),
read(t1), repeat, true, var(t1), write(t1),
writeq(t1)
atom(t1), atomic(t1), compare(t1, t2, t3),          true            f1
float(t1), ground(t1), integer(t1), number(t1)
length(t1, t2)                                      true            f2
statistics(t1, t2)                                  true            g1
abort, fail, false                                  true            false
keysort(t1, t2), sort(t1, t2)                       f1              g2
tab(t1), put(t1)                                    f1              f1
t1 is t2                                            f2              g1
t1 =:= t2, t1 =\= t2, t1 < t2, t1 > t2,             g1              g1
t1 =< t2, t1 >= t2
arg(t1, t2, t3)                                     g1              g3
name(t1, t2)                                        g4              g1
t1 =.. t2                                           g4              g2
functor(t1, t2, t3)                                 g5              g6

Table 1. Abstracting builtins where fi = ∧var(ti), g1 = f1 ∧ f2, g2 = f1 ⇔ f2,
g3 = f1 ∧ (f2 ⇒ f3), g4 = f1 ∨ f2, g5 = f1 ∨ (f2 ∧ f3) and g6 = f2 ∧ f3.
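As a rough illustration of how an inferred mode can be read, the sketch below checks calls against two of the modes reported in table 2: `sort` of permSort ($x_1 \vee x_2$) and `noattack` of queens ($x_1 \wedge x_2 \wedge x_3$). It assumes the mode is monotone (built from conjunction and disjunction only), in which case it suffices to evaluate the formula with each argument set to whether it is ground at call time. The dictionary keys and the function name are hypothetical.

```python
# Modes taken from table 2; each argument flag says whether the
# corresponding actual argument is ground when the call is made.
MODES = {
    "permSort:sort/2":   lambda x1, x2: x1 or x2,
    "queens:noattack/3": lambda x1, x2, x3: x1 and x2 and x3,
}

def call_is_safe(predicate, ground_flags):
    """True if the groundness of the call satisfies the inferred mode.
    Only meaningful for monotone modes (an assumption of this sketch)."""
    return bool(MODES[predicate](*ground_flags))

# sort([3,1,2], Xs): first argument ground, second free.
print(call_is_safe("permSort:sort/2", [True, False]))
# sort(Xs, [1,2,3]): the mode also admits this reverse use.
print(call_is_safe("permSort:sort/2", [False, True]))
# sort(Xs, Ys): neither argument ground, so no guarantee is given.
print(call_is_safe("permSort:sort/2", [False, False]))
```

A `False` answer does not mean the call will fail with an instantiation error, only that the analysis gives no guarantee for it, since the inferred modes are sufficient rather than necessary conditions.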



7 Related work 



Our work was motivated by the recent revival of interest in logic programming
with assertions (Boye et al., 1997; Puebla et al., 2000a). For example, (Puebla
et al., 2000b) argues that it is useful to trap an unexpected call to a predicate with
an assertion, otherwise a program may error at a point that is far from the source of
the problem. Moreover, (Puebla et al., 2000a) observe that predicates are normally
written with an expectation on the initial calling pattern, and hence provide an
entry assertion to make, say, the moding of the top-level queries explicit. Our
work shows how entry assertions can be automatically synthesised which ensure
that instantiation errors do not occur while executing the program.

The most closely related work concerns the demand analysis of ccp (Debray, 1993;
Falaschi et al., 2000). A demand analysis for the ccp language Janus (Saraswat et al.,
1990) is proposed in (Debray, 1993) which determines whether or not a predicate is
uni-modal. A predicate is uni-modal iff the argument tuples of its clauses share the
same minimal pattern of instantiation necessary for reduction. The demand analysis
of a predicate simply traverses the head and guard of each clause to determine
the extent to which arguments have to be instantiated. Body atoms need not be
considered so the analysis does not involve a fixpoint computation. A related paper
(Debray et al., 1992) presents a goal-dependent (forward) analysis that detects those
ccp predicates which can be scheduled left-to-right without deadlock. If assertions
are used to approximate synchronisation, then the analysis described in this paper
can be re-interpreted as a backward suspension analysis of ccp under left-to-right
scheduling.

benchmark    predicate                      mode
-----------  -----------------------------  ----------------------------------------
bubblesort   sort(x1, x2)                   x1
             ordered(x1)                    x1
             append(x1, x2, x3)             true
dnf          go                             true
             dnf(x1, x2)                    true
             norm(x1, x2)                   true
             literal(x1)                    true
heapify      greater(x1, x2)                x1 ∧ x2
             adjust(x1, x2, x3, x4)         (x1 ∧ …) ∨ (x1 ∧ x2 ∧ x3) ∨
                                            (¬x2 ∧ ¬x3 ∧ x4)
             heapify(x1, x2)                x1
permSort     select(x1, x2, x3)             true
             ordered(x1)                    x1
             permutation(x1, x2)            true
             sort(x1, x2)                   x1 ∨ x2
queens       noattack(x1, x2, x3)           x1 ∧ x2 ∧ x3
             safe(x1)                       x1
             delete(x1, x2, x3)             true
             perm(x1, x2)                   true
             queens(x1, x2)                 x1 ∨ x2
quicksort    append(x1, x2, x3)             true
             qsort(x1, x2)                  x1
             partition(x1, x2, x3, x4)      x2 ∧ (x1 ∨ (x3 ∧ x4))
treeorder    member(x1, x2)                 true
             select(x1, x2, x3)             true
             split(x1, x2, x3, x4)          true
             split(x1, ..., x7)             true
             visits2tree(x1, x2, x3)        true
             v2t(x1, x2, x3)                true
treesort     tree_to_list_aux(x1, x2, x3)   true
             tree_to_list(x1, x2)           true
             list_to_tree(x1, x2)           x1
             insert_list(x1, x2, x3)        x1 ∧ x2
             insert(x1, x2, x3)             x1 ∧ (x2 ∨ x3)
             treesort(x1, x2)               x1

Table 2. Precision of the Mode Analysis (small benchmarks)

When reasoning about module interaction it can be advantageous to reverse 
the traditional deductive approach to abstract interpretation that is based on the 
abstract unfolding of abstract goals. In particular (Giacobazzi, 1998) shows how 
abduction and abstraction can be combined to compute those properties that one 
module must satisfy to ensure that its composition with another fulfils certain 
requirements. Abductive analysis can, for example, determine how an optimisation
in one module depends on a predicate defined in another module. Abductive analysis
is related to the backward analysis presented in this paper since abduction is the
inverse image of a forward semantics whereas pseudo-complement is the inverse
image of conjunction, the basic computational step in forward (and backward)
semantics.

file              size   abs   lfp   gfp   sum
---------------  -----  ----  ----  ----  ----
astar              100    10    10     0    20
fft                104    20     0    10    30
knight             105    10     0     0    10
browse_wamcc       106    10     0     0    10
cal_wamcc          108    10    10     0    20
life               110    10    10    10    30
crypt_wamcc        113    10     0     0    10
cry_mult           118    10    10    10    30
browse             125    10    10     0    20
bid                128    10    10     0    20
disj_r             148    30     0    10    40
consultant         151    20     0    10    30
ncDP               156    10    10     0    20
tsp                162    30    20    10    60
elex_scanner       165    20    10     0    30
robot              165    10    10     0    20
sorts              172     0    10    10    20
cs2                175    30    10    10    50
sec                175    10   141     0   151
bp0-6              201    20    10     0    30
bnet               205    20    20     0    40
jons               222    40     0    10    50
mathlib            226    10    10     0    20
intervals          230    20    10    10    40
barnes_hut         240    40    30    40   110
tictactoe          258    20    10    10    40
jons2              261    20    10     0    30
kalah              269    30    10    20    60
draw               289    70    91    40   201
cs_r               311    40    20    10    70
reducer            320    40    30     0    70
sdda               336    20    21     0    41
bryant             349    30   120    21   171
ga                 363    50    30    20   100
neural             378    30    10     0    40
press              381    30    20     0    50
peep               414    50    20    10    80
nbody              421    40    20    20    80
eliza              432    50    20     0    70
read               434    40    20    10    70
simple_analyzer    512    90   701    20   811
ann                547    50    30    10    90
diffsimpsv         681    61   100     0   161
arch1              692    50    40    10   100
asm                800    60    40    30   130
poker              962    81    70    10   161
pentomino          981    50    40    80   170
chat              1037   411  1422  1082  2915
sim_v5-2          1308    80    70     0   150
semigroup         2328   180    90    60   350

Table 3. Speed of the Mode Analysis (medium-scale benchmarks); times in milliseconds

The termination inference engine of (Genaim & Codish, 2001) decomposes the
cTI analyser of (Mesnard, 1996) into two components: a termination checker (Codish
& Taboch, 1999) and the backward analysis described in this paper. First, the
termination inference engine computes a set of binary clauses which describe possible
loops in the program with size relations. Second, a Boolean function is inferred
for each predicate that describes moding conditions sufficient for each loop to only
be executed a finite number of times. Third, the backward analysis described in
this paper is applied to infer, by calculating a greatest fixpoint, initial modes which
guarantee that the moding conditions hold and thereby assure termination.
Interestingly, the cTI analyser involves a μ-calculus solver to compute the greatest
fixpoint of an equivalent (though more complex) system of equations. This seems
to suggest that greatest fixpoints are important in backward analysis.

Cousot and Cousot (Cousot & Cousot, 1992) explain how a backward collecting
semantics can be deployed to precisely characterise states that arise in finite
SLD-derivations. First, they present a forward collecting semantics that records the
descendant states that arise from a set of initial states. Second, they present a dual
(backward) collecting semantics that records those states which occur as ascendant
states of the final states. By combining both semantics, they characterise the set
of descendant states of the initial states which are also ascendant states of the
final states of the transition system. This use of backward analysis is primarily as
a device to improve the precision of a classic goal-dependent analysis. Our work is
more radical in the sense that it shows how a bottom-up analysis, performed in a
backward fashion, can be used to characterise initial queries. Moreover it is used
for lower approximation rather than upper approximation.
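The combination of forward and backward collecting semantics described above can be pictured on a small finite transition system. The sketch below is an illustration only, with a hypothetical graph and hypothetical state names: it collects the descendants of the initial states, the ascendants of the final states, and intersects the two.

```python
def reachable(start, edges):
    """All states reachable from start by following edges (including start)."""
    seen, stack = set(start), list(start)
    while stack:
        state = stack.pop()
        for succ in edges.get(state, ()):
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen

def reverse(edges):
    """Reverse every edge so that reachability runs backward."""
    rev = {}
    for state, succs in edges.items():
        for succ in succs:
            rev.setdefault(succ, set()).add(state)
    return rev

# "i" is initial and "f" is final; "b" loops forever and "u" is unreachable.
EDGES = {"i": {"a", "b"}, "a": {"f"}, "b": {"b"}, "u": {"f"}}

descendants = reachable({"i"}, EDGES)          # forward collecting semantics
ascendants = reachable({"f"}, reverse(EDGES))  # backward collecting semantics
print(sorted(descendants & ascendants))
```

The intersection keeps only the states that lie on a finite path from an initial state to a final state, discarding both the looping state and the unreachable one.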

Mazur, Janssens and Bruynooghe (Mazur et al., 2000) present a kind of ad hoc
backward analysis to derive reuse conditions from a goal-independent reuse analysis
for Mercury (Somogyi et al., 1996). The analysis propagates reuse information from
a point where a structure is decomposed in a clause to the point where the clause
is invoked in its parent clause. This is similar in spirit to how demand is passed
from a callee to a caller in the backward analysis described in this paper. However,
the reuse analysis does not propagate information right-to-left across a clause using
pseudo-complement, and so one interesting topic for future work will be to relate
these two analyses. Another matter for future work will be to investigate the extent
to which our backward mode analysis can be reconstructed by inverting abstract
functions (Hughes & Launchbury, 1994).



8 Conclusion 

We have shown how abstract interpretation, and specifically a backward analysis,
can infer moding properties which, if satisfied by the initial query, come with the
guarantee that the program and query cannot generate instantiation errors.
Backward analysis has other applications in termination inference and also in inferring
queries for which the builtins called from within the program behave predictably in
the presence of rational trees. The analysis is composed of two bottom-up fixpoint
calculations, an lfp and a gfp, both of which are straightforward to implement. The
lfp characterises success patterns. The gfp uses these success patterns to infer safe
initial calling patterns. It propagates moding requirements right-to-left, against the
control-flow, using the pseudo-complement operator. This operator fits with
backward analysis since it enables moding requirements to be minimised (maximally
weakened) in right-to-left propagation. This operator, however, requires that the
computational domain be closed under Heyting completion (or, equivalently,
condense). This requirement seems reasonable because disjunctive dependencies occur
frequently in right-to-left propagation and therefore significant precision would be
lost if the requirement were relaxed. Experimental evaluation has demonstrated that
the analysis is practical in the sense that it can infer calling modes for medium-scale
programs. Finally, our work adds weight to the belief that condensing is an
important property in the analysis of logic programs.
A Proof appendix



Proof for proposition 3. 1 

Proof by induction. Let I = 0, I' Q = 0, I k+1 = T c P {I k ) and f k+1 = T^\P k ). To 
show {{Ik) E I' k since then it follows that |(lfp(.Fp)) = |(U fceN / fe ) C U fceN j(4) C 

U fceN /[, = lfp^^). The base case is trivial so suppose i(Ik) E I'k- 
Let :-c']~ G ifc+i. Then there exists p(x) :- c,px(x\), . . . ,p n (x n ) G P and 

{[Pi(xi) :-Ci]~}" =1 C I k such that c' = c <8> (g^B^Ci). Observe that |(c') C 
|(c) n n^J^- (c,)) C |(c) n r\™ =1 3 £t (|(ci)). But by the inductive hypothesis, there 
exist {[pj(xi) :-c-]~}™ =1 C i£ such that |(cj) C c-. Hence [p(x) :-c"]~ G such 
that J,(c') C c" so that I k +i E , 1 and the result follows. □ 



Proof for theorem 3.S 



The proof tactic is analogous to that used for proposition 3.1. □ 



Proof for corollary 3.1 



Let C be a semi-cylindric constraint system and p G uco(p^(C)) be a semi-morphism. 
By proposition |1] it follows that 1{T C (P)) E .T 7 ^ ( j(P)) and hence j o(|(J rC (P))) C 
p (jtpHc)(^p))) and by theorem |I| /9 (^f ji ( c )(|(P))) C ^^(/^(P))) and so the 
result follows. □ 



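Proof for proposition 5.1

Let $D_{n+1} \subseteq D_n$ for all $n \in \mathbb{N}$. Put $E_n = \bigcup\{D_l \mid l \geq n\}$
and $E = \bigcap\{E_n \mid n \in \mathbb{N}\}$. Since $D_{n+1} \subseteq D_n$ observe that
$E_n = D_n$ for all $n \in \mathbb{N}$ and hence

$$
\begin{array}{rcl}
\mathcal{D}_P^{\rho,C}(\sqcap\{[D_n]_{\equiv} \mid n \in \mathbb{N}\})
& = & \mathcal{D}_P^{\rho,C}(\sqcap\{[E_n]_{\equiv} \mid n \in \mathbb{N}\})
  = \mathcal{D}_P^{\rho,C}([E]_{\equiv})
  = [\mathcal{D}_P^{\rho,C}(E)]_{\equiv} \\
& = & [\bigcap\{\mathcal{D}_P^{\rho,C}(E_n) \mid n \in \mathbb{N}\}]_{\equiv}
  = \sqcap\{[\mathcal{D}_P^{\rho,C}(E_n)]_{\equiv} \mid n \in \mathbb{N}\} \\
& = & \sqcap\{\mathcal{D}_P^{\rho,C}([E_n]_{\equiv}) \mid n \in \mathbb{N}\}
  = \sqcap\{\mathcal{D}_P^{\rho,C}([D_n]_{\equiv}) \mid n \in \mathbb{N}\}. \quad \Box
\end{array}
$$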



Proof for lemma 5.1 



Proof by (double) induction. Let (p(y); c"; 1) = s\ s n 0, = (gi]Ci;hi) 
and suppose (£>£ C ) fe (T) = [D fc ]=. The outer induction is on fc. 

fease case: Suppose max({max(/ii) 1 < i < n}) < 1 and [p(y) :-e}~ G Di. Thus 
si so that si =>p and hence there exists p(x') :- C o c' , g 1 G P such that 
<9f'(%(c")) g C". Then [p(f) :- e% G L>i where 3*0**,* ® e) = 3 z -(d z -, F ® e') and 
var(z) n (var(y) U FV(e) U var(f') U FV(e')) = 0. Observe that e! C Vp(C') and 
thus_a|(%(e)) = 3^(e') Cj^V^ (<?')) = Vy(C') C C Hence df (%(c")) 
<9| (3ff(e)) so that 3j;(c") 3^(e) as required. 

inductive case: Suppose k = max({max(/ii) | 1 < i < n}) > 1 and [p(y) :-e]~ G D k . 
Suppose, for the sake of a contradiction, that 3fj(c") G 3^(e). Since k > 1 there 
exists u; = p(x) :- C o c,pi{x\), . . . ,pi(xi) G P, <p G -Ren such that </?(CLP(u;)) = 
p(f') :-d,p 1 {x' 1 ),...,pi(x' l ) < Sl CLP(P) and s 2 = (pi(fi), . . . ,pi{x{)\ c[; 2 l ) and 
c'j = c" ® ® c'. Suppose (pi(xi); ci) (e;c' 2 ), (p m (x' m ); c' m ) ^>* P 0- 
Without loss of generality assume FV(CLP(w))nFV(c^) = for all i G [1, to]. Let 
v = x-xi ■ ■ ■ xi and v' — x' -x[ ■ ■ ■ x\. Let g\ G C such that (pi(a;^); 1) -^p (e; g^) and 
c^ +1 = c \®g[ for all i G [1, to). For all i G [1, to), put g { = dp\g[). Put ci = «9f,(ci) 

and for all i G [2,m], put c^ = dp^c'A. Then c^+i = Ci <S> g% for all i G 
Let O c {P) = [F\=. By proposition |£| \pi(xi) = [pi(^) G F for all 

i G [l,m). By theorem £> C (P) = .P C (P) and by corollary [T| pQC^CP))) C 
J" cod ^)(p(|(P))). Thus for i e [l,m) there exists [pi(xi) > G P such that 
pQ(Si)) C /j. Put fi = for all j G [to,/] to ensure [pi(xi) :- G P for 
all z G [to,?]. Let [pi(xi) :- di]~ G P& for all i G [!,?]• Finally put e n +i = C, 
e, — did (fi e i+ i) for all i G [1,2] and eo = C (~l (p(|(c)) — >' ei). The inner 
induction is on i and is used to show p(j(cj)) C a for all i G [1, m]. 

base case: Now c x = d$(c[) < df(3 f (c")) G C. Thus p(j(ci)) C C. Furthermore, 
ci = a|,(ci) ^d$,(c') = c. Thus p(|(ci)) C p(I(c)). Moreover, ci < 9|(%(c")) G 
9|(%(e)) C Vx(e ) C eo. Thus p(|(ci)) C e . However, e = Cn(pU(c)) -V ei). 
Thusp^d)) C Cn(p(|(c)) -V ei) andpCKd)) C p(j( Cl ))nCn(p(j(c)) -V ei ) 

- n (p(j( c )) ei ) = p(I( Ci )) n P (|(c)) n ( P (j(c)) ei ) - P (i( Cl )) n 

P(i( c )) H ei = /o(|(ci)) n e\. Therefore p(|(ci)) C e x as required. 
inductive case: Suppose /o(j(cj)) C <p(ei). Now /j(j(cj+i)) = p(|(cj ® gj)) 

c p(^ci)) n p(|( ffl )) c et n p(|( 9i )) c ei n P {i{ 9l )) c ei nK (/< -V e l+1 ) n /< 

= ej+i. Therefore p(J,(cj_|_i)) C gj+i as required. 

Thus p(|(c m )) C e m C d m so that Let <4, = ^ m (3^ m (d m )) and 

observe that [p m (4n) : " rf m]~ = [Pm{x m ) --d m ]~ G Pfc. Put c" n = 9^(3^ m (c m )) 
so that c' m < c" n £ d' m . By the inductive hypothesis (p m (x' m ); c' m ) ^>p which is 
a contradiction and hence 3jj(c") 3^(e) as required. 

The result follows. □ 



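Proof for theorem 5.1

Let $\mathcal{D}^{\rho,C}(P) = [D]_{\equiv}$, $[p(\vec{y}) \mathrel{:\!-} e]_{\approx} \in D$ and
$c'' \in \overline{\exists}_{\vec{y}}(e)$. Thus
$\overline{\exists}_{\vec{y}}(c'') \in \overline{\exists}_{\vec{y}}(\overline{\exists}_{\vec{y}}(e)) = \overline{\exists}_{\vec{y}}(e)$.
Suppose, for the sake of a contradiction, that
$\langle p(\vec{y}); c''; 1 \rangle = s_1 \Rrightarrow_P \cdots \Rrightarrow_P s_n \Rrightarrow_P \Diamond$
where $s_i = \langle g_i; c_i; h_i \rangle$. Let $k = \max(\{\max(h_i) \mid 1 \leq i \leq n\})$.
Suppose $(\mathcal{D}_P^{\rho,C})^k(\top) = [D_k]_{\equiv}$. Since $D \subseteq D_k$ and by
lemma 5.1 there exists $[p(\vec{y}) \mathrel{:\!-} e']_{\approx} \in D_k$ such that
$\overline{\exists}_{\vec{y}}(c'') \notin \overline{\exists}_{\vec{y}}(e')$. Since
$\overline{\exists}_{\vec{y}}(e) \subseteq \overline{\exists}_{\vec{y}}(e')$ it follows that
$\overline{\exists}_{\vec{y}}(c'') \notin \overline{\exists}_{\vec{y}}(e)$ which is a
contradiction. The result follows. $\Box$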



